Scrapers always check for a local copy of the target resource (using Scraper.checkForLocalRecord) before executing a scrape from an external resource. If the resource was found locally (and therefore no external calls were made), this is set to true.
A simple, human-readable description of what is being scraped. Used for logging.
Used to cache failed results, so that further calls to the same resource can be blacklisted.
Contains all results generated by Scraper.scrape, including recursive calls.
Stores the DOM retrieved by scraperapi.
Flag indicating a successful scrape. Set to true after a call to Scraper.scrape completes without throwing an error.
External URL indicating the scraper's target resource.
Used to override .env settings and force-log the output of a given scraper.
Gets the local stored record corresponding to a given scraper. Should return null if no local record is found. By default, returns false (resource is always scraped).
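To illustrate the contract described for Scraper.checkForLocalRecord, a subclass override might look roughly like the following. This is a minimal sketch, not the project's implementation: the LocalRecord shape, the in-memory localStore, the synchronous signature, and the class names SketchScraper and CachedPageScraper are all assumptions made for the example.

```typescript
// Hypothetical record shape; the real project presumably reads from a database.
interface LocalRecord {
  url: string;
  body: string;
}

// Hypothetical in-memory stand-in for local storage, keyed by URL.
const localStore: Record<string, LocalRecord> = {
  "https://example.com/a": { url: "https://example.com/a", body: "<html></html>" },
};

// Simplified stand-in for the Scraper base class (assumed shape).
class SketchScraper {
  url: string;

  constructor(url: string) {
    this.url = url;
  }

  // Mirrors the documented default: no local lookup, so the resource is always scraped.
  checkForLocalRecord(): LocalRecord | null | false {
    return false;
  }
}

// A subclass that consults local storage and returns null on a miss.
class CachedPageScraper extends SketchScraper {
  checkForLocalRecord(): LocalRecord | null {
    const record = localStore[this.url];
    return record !== undefined ? record : null;
  }
}
```

With this sketch, a CachedPageScraper for a stored URL returns the record, one for an unknown URL returns null, and the base class keeps returning false.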
Extracts information from a scraped resource synchronously.
Prints a detailed report of local properties for a scraper. Used for debugging.
Simple CLI reporting tool for debugging unsuccessful scrapes.
Queries scraperapi for ScraperApiScraper.url.
Required .env variables:
SCRAPER_API_KEY: scraperapi dashboard
SCRAPER_API_REQUEST_ATTEMPTS: Times a request is allowed to fail before error is thrown
Tracks the number of times a given request has failed, used to track recurring calls to this function. Should never be set if called externally.
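The retry behavior described here (a failure counter passed through recursive calls, bounded by SCRAPER_API_REQUEST_ATTEMPTS) could be sketched as below. This is only an illustration under stated assumptions: the function names requestWithRetries and maxAttempts, the injected fetchPage callback, passing the environment as a plain object instead of reading process.env, and the fallback limit of 3 are not taken from the project.

```typescript
// Hypothetical signature for whatever performs the actual HTTP request.
type FetchPage = (url: string) => Promise<string>;

// Reads the retry limit from an env-like object; the fallback of 3 is an assumption.
function maxAttempts(env: Record<string, string | undefined>): number {
  const parsed = parseInt(env["SCRAPER_API_REQUEST_ATTEMPTS"] ?? "", 10);
  return isNaN(parsed) ? 3 : parsed;
}

// `attempts` counts the failures so far. External callers leave it at its
// default; only the recursive call passes a non-zero value.
async function requestWithRetries(
  url: string,
  fetchPage: FetchPage,
  env: Record<string, string | undefined>,
  attempts = 0,
): Promise<string> {
  try {
    return await fetchPage(url);
  } catch (error) {
    const failures = attempts + 1;
    if (failures >= maxAttempts(env)) {
      throw error; // limit reached: surface the last error to the caller
    }
    return requestWithRetries(url, fetchPage, env, failures);
  }
}
```

Keeping the counter as a defaulted trailing parameter matches the note that it should never be set when the function is called externally.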
Saves scraped, extracted, and parsed information into a local record. By default, does nothing.
the entity that was saved
Entry point for initiating an asset scrape. General scrape outline/method order:
If set to true, scrapes the external resource regardless of any existing local records
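How this flag interacts with the local-record check can be sketched as follows. The sketch is deliberately simplified and synchronous, and is not the project's code: the class FlowSketchScraper, the property names dataReadFromLocal and externalCalls, and the boolean return of checkForLocalRecord are assumptions for illustration.

```typescript
// Hypothetical, simplified stand-in for the Scraper class.
class FlowSketchScraper {
  // Hypothetical name for the documented "found locally" flag.
  dataReadFromLocal = false;
  // Counts simulated external requests so the effect of the flag is observable.
  externalCalls = 0;
  private hasLocalRecord: boolean;

  constructor(hasLocalRecord: boolean) {
    this.hasLocalRecord = hasLocalRecord;
  }

  // Stand-in for the local lookup; here it only reports whether a record exists.
  checkForLocalRecord(): boolean {
    return this.hasLocalRecord;
  }

  scrape(forceScrape = false): void {
    // Skip the external call only when not forced and a local record exists.
    if (!forceScrape && this.checkForLocalRecord()) {
      this.dataReadFromLocal = true;
      return;
    }
    this.externalCalls += 1; // stands in for the external request
  }
}
```

In this sketch, calling scrape() with a local record present makes no external call, while scrape(true) bypasses the local record and always goes to the external resource.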
Executes Scraper.scrape on any recursive scrapes found in the initial scrape. Defaults to simply resolving an empty promise, so subclasses with no dependencies don't have to implement this function. See Scraper.scrape for more information on implementation.
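A rough sketch of the default and an overriding subclass is shown below. It assumes a much-reduced base class; the names DependencySketchScraper, GenreSketchScraper, ArtistSketchScraper, genreScrapers, and scrapeSucceeded are hypothetical, and the real Scraper.scrape does considerably more than this stand-in.

```typescript
// Simplified stand-in for the Scraper base class (assumed shape).
class DependencySketchScraper {
  description: string;
  // Hypothetical name for the documented success flag.
  scrapeSucceeded = false;

  constructor(description: string) {
    this.description = description;
  }

  // Stands in for the real scrape: runs dependencies, then flags success.
  async scrape(): Promise<void> {
    await this.scrapeDependencies();
    this.scrapeSucceeded = true;
  }

  // Documented default: resolve immediately so leaf scrapers need no override.
  protected scrapeDependencies(): Promise<void> {
    return Promise.resolve();
  }
}

// Hypothetical leaf scraper with no dependencies; it inherits the default.
class GenreSketchScraper extends DependencySketchScraper {}

// Hypothetical scraper whose initial scrape yields genres to scrape recursively.
class ArtistSketchScraper extends DependencySketchScraper {
  genreScrapers: GenreSketchScraper[] = [];

  constructor(name: string, genres: string[]) {
    super(`Artist: ${name}`);
    this.genreScrapers = genres.map((g) => new GenreSketchScraper(`Genre: ${g}`));
  }

  // Runs scrape on each dependency in turn.
  protected async scrapeDependencies(): Promise<void> {
    for (const scraper of this.genreScrapers) {
      await scraper.scrape();
    }
  }
}
```

Scraping the dependencies sequentially keeps the example easy to follow; whether the real subclasses run them sequentially or in parallel is not stated in this reference.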
the entity that was saved
When a scrape fails, add it to the blacklist, then throw the error.
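The blacklist-then-rethrow pattern could look something like this. It is a sketch under assumptions, not the project's implementation: the in-memory failedUrls array and the helper names blacklistAndThrow and isBlacklisted are hypothetical, and the real project presumably persists failed results rather than holding them in memory.

```typescript
// Hypothetical in-memory blacklist of URLs whose scrape has failed.
const failedUrls: string[] = [];

// Records the failed URL (once), then rethrows so callers still see the error.
function blacklistAndThrow(url: string, error: Error): never {
  if (failedUrls.indexOf(url) === -1) {
    failedUrls.push(url);
  }
  throw error;
}

// Callers can consult the blacklist before attempting another scrape.
function isBlacklisted(url: string): boolean {
  return failedUrls.indexOf(url) !== -1;
}
```

Rethrowing the original error, instead of swallowing it, keeps the failure visible to the caller while still recording it for later calls.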
Scrape the genres associated with this artist
Superclass for all "scrapers" leveraging scraperapi