This is the code that adds the new domain:
If IsNewDomain Then
    Dim iNewDomainID As Integer = 0
    Dim objNewDomain As New Venexus.BusinessObjects.VenexusDomain
    objNewDomain.AddNew()
    objNewDomain.DomainName = GetDomainName(sURL)
    objNewDomain.LastRequest = Now()
    If IsGlobalCrawler() = False Then
        ' Not a global crawler: approve only if a feed record exists for this URL
        Dim objSeamus As New VenexusSeamusCollection
        objSeamus.Query.Where(objSeamus.Query.Url.Equal(sURL))
        If objSeamus.Query.Load Then
            objNewDomain.IsApproved = True
        Else
            objNewDomain.IsApproved = False
        End If
    Else
        ' Global crawler: new domains are always approved here
        objNewDomain.IsApproved = True
    End If
    objNewDomain.Save()
    ' Use the ID of the newly saved domain for the robots record below
    iNewDomainID = CType(objNewDomain.DomainID, Integer)
    iDomainID = iNewDomainID
End If
' Store the robots.txt text for the domain
Dim objNewRobots As New Venexus.BusinessObjects.VenexusRobots
objNewRobots.AddNew()
objNewRobots.DomainID = iDomainID
objNewRobots.LastUpdate = Now()
objNewRobots.RobotsText = sRobots
objNewRobots.Save()
If it is NOT a GlobalCrawler, it checks whether a feed exists for the URL. If one exists, the domain is marked approved. If not, it is marked not approved.
If it is a GlobalCrawler, the domain is always approved automatically. However, the robots.txt file is also checked: if robots.txt disallows crawling, the domain is automatically changed to NOT approved. Maybe in your cases the domain has a robots.txt file that disallows crawling?
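To make the rules above easier to check against your own cases, here is a minimal sketch of the approval decision as a single function. ResolveApproval and its three parameters are hypothetical names made up for illustration and are not part of the Venexus code. The text above only describes the robots.txt override for the GlobalCrawler case; applying it in both modes is an assumption of this sketch.

```vbnet
Imports System

Module ApprovalSketch
    ' Hypothetical helper, not part of the Venexus code base.
    ' isGlobalCrawler : True when running as the global crawler
    ' feedExists      : True when a feed record was found for the URL
    ' robotsDisallows : True when robots.txt forbids crawling the domain
    Function ResolveApproval(isGlobalCrawler As Boolean,
                             feedExists As Boolean,
                             robotsDisallows As Boolean) As Boolean
        ' Global crawler: always approved. Otherwise: approved only if a feed exists.
        Dim approved As Boolean = If(isGlobalCrawler, True, feedExists)

        ' Assumption: a disallowing robots.txt overrides approval in both modes.
        If robotsDisallows Then
            approved = False
        End If

        Return approved
    End Function

    Sub Main()
        ' Global crawler, but robots.txt disallows crawling
        Console.WriteLine(ResolveApproval(True, False, True))
        ' Not a global crawler, feed exists, robots.txt allows crawling
        Console.WriteLine(ResolveApproval(False, True, False))
    End Sub
End Module
```

If a domain you expected to be approved comes back unapproved under the global crawler, the first call in Main is the case to look at: the only input that can flip it is the robots.txt check.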