Use this parameter to specify a bit-wise mask number that determines which pages to discard. Enter a bit-wise mask number specifying where the strings defined in the MustHaveCSVs parameter must not appear for a page to be discarded. Create the number by adding together the following numbers as appropriate:
URL: 1 | If you enter 1, the connector checks whether the URL of a page contains any of the strings specified in the MustHaveCSVs parameter. If the URL contains none of these strings, the connector discards the page. |
Page header: 4 | If you enter 4, the connector checks whether the HTML <HEAD> tag of a page contains any of the strings specified in the MustHaveCSVs parameter. If the tag contains none of these strings, the connector discards the page. |
Page content: 8 | If you enter 8, the connector checks whether the content of a page contains any of the strings specified in the MustHaveCSVs parameter. If the content contains none of these strings, the connector discards the page. |
Case insensitive: 64 |
If you add Note that if you specify |
Before download: 128 |
If you add Note that if you specify |
Spider check cache URL: 256 |
If you enter Note that if you specify 256, you must also specify |
Valid site structure: 512 | If you enter 512, the connector rechecks the CantHaveCSVs values for the site to ensure that the site is still valid before updating it. If you do not include this setting, changes to these values are never checked. If the site is not valid, it is not downloaded. |
Spider strip content: 1024 | If you enter 1024, the connector unescapes any HTML entities in downloaded pages. This can affect other functionality: for example, if the date format of a page contains HTML entities, they are removed before the date check is performed. |
Spider check content type: 2048 | If you enter 2048, the connector checks the content type of the page for the strings specified in CantHaveCSVs before downloading the page. If the content type contains any of the CantHaveCSVs strings, the page is not downloaded. |
If you enter 0, the connector does not check for MustHaveCSVs.
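The additive mask described above can be sketched in Python as follows. This only illustrates the arithmetic: the flag names (`URL`, `PAGE_HEADER`, and so on) and the `build_mask` helper are hypothetical labels chosen for this example, not identifiers that the connector exposes. Only the numeric values come from the list above.

```python
from enum import IntFlag


class MustHaveFlag(IntFlag):
    """Hypothetical names for the MustHaveCheck values listed above."""
    URL = 1
    PAGE_HEADER = 4
    PAGE_CONTENT = 8
    CASE_INSENSITIVE = 64
    BEFORE_DOWNLOAD = 128
    SPIDER_CHECK_CACHE_URL = 256
    VALID_SITE_STRUCTURE = 512
    SPIDER_STRIP_CONTENT = 1024
    SPIDER_CHECK_CONTENT_TYPE = 2048


def build_mask(*flags: MustHaveFlag) -> int:
    """Combine flags into the single number to enter for MustHaveCheck."""
    mask = 0
    for flag in flags:
        # Each value is a distinct bit, so bitwise OR gives the same
        # result as adding the numbers together.
        mask |= flag
    return int(mask)


# Check the URL and the page content only: 1 + 8
mask = build_mask(MustHaveFlag.URL, MustHaveFlag.PAGE_CONTENT)
print(f"MustHaveCheck={mask}")  # prints "MustHaveCheck=9"
```

Because every value occupies its own bit, selecting the same check twice does not change the result, which is one reason to prefer OR over plain addition when computing the number in a script.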
Type: | Long |
Default: | 0 |
Required: | No |
Configuration Section: | TaskName or Default |
Example: | MustHaveCheck=77 |
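To read an existing value such as the example `MustHaveCheck=77`, the number can be split back into its component values. The sketch below is an illustration only: the `decode_mask` helper and the `CHECKS` dictionary are assumptions made for this example, with the numeric values and labels taken from the list earlier in this entry.

```python
# Values and labels from the MustHaveCheck list; the dict itself is illustrative.
CHECKS = {
    1: "URL",
    4: "Page header",
    8: "Page content",
    64: "Case insensitive",
    128: "Before download",
    256: "Spider check cache URL",
    512: "Valid site structure",
    1024: "Spider strip content",
    2048: "Spider check content type",
}


def decode_mask(mask: int) -> list[str]:
    """Return the label of every documented value whose bit is set in mask."""
    return [label for value, label in CHECKS.items() if mask & value]


# 77 = 64 + 8 + 4 + 1
print(decode_mask(77))
# prints ['URL', 'Page header', 'Page content', 'Case insensitive']
```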