Scrapers always check for a local copy of the target resource (using Scraper.checkForLocalRecord) before executing a scrape from an external resource. This is set to true if the local copy was found, and therefore no external calls were made.
A simple, human-readable description of what is being scraped. Used for logging.
Used for caching failed results, so that further calls to the same resource can be blacklisted.
Contains all results generated by Scraper.scrape, including recursive calls.
Array of between 0 and 25 [[Review]] instances per page
Stores the DOM retrieved by scraperapi
Flag indicating a successful scrape, set to true after a call to Scraper.scrape completes without throwing an error.
Number of times [[ReviewPage.scrapePage]] has failed for the current [[ReviewPage.currentPage]]
External url indicating the scraper's target resource.
Review page URL, without a page number. Example:
https://rateyourmusic.com/collection/frenchie/r0.0-5.0/
Used to override .env settings and force-log the output of a given scraper.
Gets the local stored record corresponding to a given scraper. Should return null if no local record is found. By default, returns false (resource is always scraped).
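The local-record check described above can be sketched as follows. This is a minimal, synchronous illustration and not the library's implementation: the class name `SketchScraper`, the property names `dataReadFromLocal`, `scrapeSucceeded` and `externalCalls`, the method `requestScrape`, and the `forceScrape` parameter name are all assumptions made for the example. It also assumes checkForLocalRecord reports a boolean.

```typescript
// Hypothetical, simplified stand-in for the documented Scraper base class.
abstract class SketchScraper {
    // True when the last scrape was satisfied by a local record.
    public dataReadFromLocal = false;
    // True after a scrape call that did not throw.
    public scrapeSucceeded = false;
    // Counts external requests, only so the example can be inspected.
    public externalCalls = 0;

    // Default behaviour: report no local record, so the resource is always scraped.
    protected checkForLocalRecord(): boolean {
        return false;
    }

    // Placeholder for the external request (assumed name).
    protected abstract requestScrape(): void;

    public scrape(forceScrape = false): void {
        this.dataReadFromLocal = false;
        if (!forceScrape && this.checkForLocalRecord()) {
            // A local copy exists, so no external call is made.
            this.dataReadFromLocal = true;
            this.scrapeSucceeded = true;
            return;
        }
        this.requestScrape();
        this.scrapeSucceeded = true;
    }
}

// Uses the default check, so every scrape goes to the external resource.
class RemoteOnlyScraper extends SketchScraper {
    protected requestScrape(): void {
        this.externalCalls += 1;
    }
}

// Pretends a local record is always present.
class CachedScraper extends RemoteOnlyScraper {
    protected checkForLocalRecord(): boolean {
        return true;
    }
}
```

In this sketch, a `CachedScraper` makes no external call unless `scrape(true)` is used to force one.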
Simple CLI reporting tool for debugging unsuccessful scrapes
Queries scraperapi for ScraperApiScraper.url
Required .env variables:
SCRAPER_API_KEY: scraperapi dashboard
SCRAPER_API_REQUEST_ATTEMPTS: Times a request is allowed to fail before error is thrown
Tracks the number of times a given request has failed, used to track recurring calls to this function. Should never be set if called externally.
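A retry loop driven by SCRAPER_API_REQUEST_ATTEMPTS and an internal failure counter might look like the sketch below. The function names `maxAttempts` and `requestWithRetry`, the fallback of 3, and the synchronous callback are assumptions for illustration. The sketch also assumes the limit means the total number of failed attempts tolerated before the error is rethrown, which the description above does not spell out.

```typescript
// Env-like record, so the example does not depend on Node's process.env typings.
type Env = Record<string, string | undefined>;

// Hypothetical helper: parses the retry limit, falling back when unset or invalid.
function maxAttempts(env: Env, fallback = 3): number {
    const parsed = parseInt(env['SCRAPER_API_REQUEST_ATTEMPTS'] ?? '', 10);
    return isNaN(parsed) || parsed < 1 ? fallback : parsed;
}

// Calls `request` until it succeeds or `limit` failures have occurred.
// `attempts` is the failure count so far; external callers leave it at 0
// and only the recursive call passes a different value.
function requestWithRetry<T>(request: () => T, limit: number, attempts = 0): T {
    try {
        return request();
    } catch (err) {
        const failed = attempts + 1;
        if (failed >= limit) {
            // Limit reached: surface the last error to the caller.
            throw err;
        }
        return requestWithRetry(request, limit, failed);
    }
}
```

Keeping the counter as a defaulted trailing parameter mirrors the note above that it should never be set by external callers.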
Saves scraped, extracted, and parsed information into a local record. By default, does nothing.
the entity that was saved
Entry point for initiating an asset scrape. General scrape outline/method order:
If set to true, scrapes the external resource regardless of any existing local records
When a scrape fails, add it to the blacklist, then throw the error
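The blacklist-then-rethrow behaviour can be illustrated with a small sketch. The `FailureBlacklist` class, the `scrapeWithBlacklist` function, and the use of an in-memory record keyed by URL are assumptions made for this example. How the real scraper stores failed results, and whether it throws when a blacklisted resource is requested again, is not stated here.

```typescript
// Hypothetical in-memory blacklist keyed by target URL.
class FailureBlacklist {
    private readonly failed: Record<string, string> = {};

    public add(url: string, reason: string): void {
        this.failed[url] = reason;
    }

    public has(url: string): boolean {
        return this.failed[url] !== undefined;
    }
}

// Runs `scrape` for `url`, recording any failure before rethrowing it.
function scrapeWithBlacklist<T>(
    url: string,
    blacklist: FailureBlacklist,
    scrape: (target: string) => T,
): T {
    if (blacklist.has(url)) {
        // Assumed behaviour: refuse further calls to a known-bad resource.
        throw new Error(`Skipping blacklisted resource: ${url}`);
    }
    try {
        return scrape(url);
    } catch (err) {
        const reason = err instanceof Error ? err.message : String(err);
        // Cache the failure first, then let the original error propagate.
        blacklist.add(url, reason);
        throw err;
    }
}
```

Rethrowing the original error keeps the caller's error handling unchanged, while the blacklist prevents a repeated external call for the same URL.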
Scrapes the review page for a user.
Scrape the genres associated with this artist
Manages the scraping and storage of all review pages for a single Rate Your Music user.
This class uses scraperapi but, unlike other RYM scrapers, has no one-to-one relationship to a single database entity. RymScraper's data flow requires such a one-to-one relationship, so ReviewPageScraper extends ScraperApiScraper instead of RymScraper.