API Documentation

This page describes the Classes and Functions used by the Price Scraper program.

product Module

settings Module

This module defines the basic settings used throughout the application.

settings.ATTRIBUTES_TO_NAMES = {'sese_number': 'SESE SKU', 'weight': 'Weight', 'sese_category': 'SESE Category', 'sese_organic': 'SESE Organic', 'organic': 'Organic', 'price': 'Price', 'number': 'ID#', 'sese_name': 'SESE Name', 'name': 'Name'}

A dictionary mapping attribute names to their display names.

settings.ATTRIBUTE_HEADER_ORDER = ('price', 'weight', 'number', 'name', 'organic')

The output order of each Other Company’s attributes

settings.COMPANIES_TO_PROCESS = ['sites.botanical_interests.BotanicalInterests', 'sites.fedco_seeds.FedcoSeeds', 'sites.johnny_seeds.JohnnySeeds', 'sites.high_mowing.HighMowing', 'sites.territorial.Territorial', 'sites.fruition.Fruition', 'sites.hudson_valley.HudsonValley', 'sites.seed_savers.SeedSavers']

The Other Companies to process (each given as the dotted path to its Class).
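Because each entry is a dotted path rather than a Class object, something has to turn the string into a Class before it can be used. How the program does this is not shown on this page; the following is only a minimal sketch of one common approach using importlib. The helper name load_class is hypothetical, and a standard-library path stands in for the sites package so that the example runs on its own.

```python
import importlib


def load_class(dotted_path):
    """Import and return the class named by a 'package.module.ClassName' path.

    Hypothetical helper; not part of the documented API.
    """
    # Split 'sites.fedco_seeds.FedcoSeeds' into module path and class name.
    module_path, _, class_name = dotted_path.rpartition(".")
    module = importlib.import_module(module_path)
    return getattr(module, class_name)


# A standard-library path is used here because the sites package is not
# available in this sketch; an entry such as 'sites.fedco_seeds.FedcoSeeds'
# would be resolved in the same way.
ordered_dict_class = load_class("collections.OrderedDict")
```

Under this assumption, iterating over settings.COMPANIES_TO_PROCESS and calling load_class on each entry would yield the Classes to instantiate.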

settings.COMPANY_HEADER_ORDER = ('fe', 'js', 'hm', 'ss', 'ts', 'fs', 'hv', 'bi')

The output order of the Other Companies (by abbreviation).

settings.MINIMUM_NAME_MATCHING_PERCENTAGE = 36

The minimum percentage of words in common between SESE’s and Other Company’s Product names for them to be considered a match.
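The exact formula behind this percentage is not given on this page. As an illustration only, the sketch below assumes the percentage is the share of words in SESE's Product name that also appear in the Other Company's Product name, compared case-insensitively. The function names are hypothetical.

```python
MINIMUM_NAME_MATCHING_PERCENTAGE = 36  # value taken from the settings above


def name_match_percentage(sese_name, other_name):
    """Percentage of words in sese_name that also occur in other_name.

    Assumed formula; the program's real calculation may differ.
    """
    sese_words = set(sese_name.lower().split())
    other_words = set(other_name.lower().split())
    if not sese_words:
        return 0.0
    shared_words = sese_words & other_words
    return 100.0 * len(shared_words) / len(sese_words)


def is_name_match(sese_name, other_name):
    """True when the names share enough words to count as a match."""
    percentage = name_match_percentage(sese_name, other_name)
    return percentage >= MINIMUM_NAME_MATCHING_PERCENTAGE
```

Under this assumed formula, 'Black Krim Tomato' and 'Brandywine Tomato' share one of three words (about 33%) and fall below the threshold, while 'Roma Tomato' and 'Brandywine Tomato' share one of two words (50%) and pass it.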

settings.SESE_HEADER_ORDER = ('sese_number', 'sese_organic', 'sese_name', 'sese_category')

The output order of SESE attributes.
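The ordering settings and the display-name dictionary can be combined to build a header row. The layout of the program's real output is not documented here, so the sketch below rests on assumptions: SESE columns come first, followed by one group of columns per Other Company, and each company column is labelled with the upper-cased abbreviation plus the display name. The function build_header_row is hypothetical.

```python
# Values copied from the settings described above.
ATTRIBUTES_TO_NAMES = {
    'sese_number': 'SESE SKU', 'weight': 'Weight',
    'sese_category': 'SESE Category', 'sese_organic': 'SESE Organic',
    'organic': 'Organic', 'price': 'Price', 'number': 'ID#',
    'sese_name': 'SESE Name', 'name': 'Name',
}
SESE_HEADER_ORDER = ('sese_number', 'sese_organic', 'sese_name', 'sese_category')
COMPANY_HEADER_ORDER = ('fe', 'js', 'hm', 'ss', 'ts', 'fs', 'hv', 'bi')
ATTRIBUTE_HEADER_ORDER = ('price', 'weight', 'number', 'name', 'organic')


def build_header_row():
    """Return a list of column labels (assumed layout, see lead-in)."""
    # SESE columns first, in their configured order.
    header = [ATTRIBUTES_TO_NAMES[attribute] for attribute in SESE_HEADER_ORDER]
    # Then one block of attribute columns per Other Company.
    for abbreviation in COMPANY_HEADER_ORDER:
        for attribute in ATTRIBUTE_HEADER_ORDER:
            label = "{} {}".format(abbreviation.upper(),
                                   ATTRIBUTES_TO_NAMES[attribute])
            header.append(label)
    return header


header = build_header_row()
```

With the settings above, this produces 4 SESE columns followed by 8 groups of 5 company columns.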

settings.WORKER_PROCESS_COUNT = 8

The number of worker processes to create when processing Products.

util Module

sites Module

The sites module contains Classes that describe the websites the program will scrape. Searching and Parsing for each website are defined by a Class that inherits from the BaseSite Class. Each Object represents a single Product from an Other Company’s website.

Upon initialization, each Object visits the Other Company’s website, finds the Product that best matches the provided Name, Category and Organic Status, and scrapes that Product’s details, such as its name, organic status, model number, weight and price.

The get_company_attributes() method of each child Class can be used to retrieve the information about the Other Company’s Product.
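The search-then-parse pattern described above can be sketched as follows. The real BaseSite interface is not documented on this page, so everything except the name get_company_attributes() is an assumption: the constructor arguments, the _search and _parse hooks, the dictionary return type, and the ExampleSeeds company are all hypothetical. Canned data replaces the network request so that the example runs on its own.

```python
class BaseSite:
    """Hypothetical stand-in for the real BaseSite Class."""

    def __init__(self, sese_name, sese_category, sese_organic):
        self.sese_name = sese_name
        self.sese_category = sese_category
        self.sese_organic = sese_organic
        # Search and parse happen on initialization, as described above.
        page = self._search()
        self._attributes = self._parse(page)

    def _search(self):
        """Find the best matching Product page (assumed hook)."""
        raise NotImplementedError

    def _parse(self, page):
        """Extract the Product's details from the page (assumed hook)."""
        raise NotImplementedError

    def get_company_attributes(self):
        """Return the scraped details; a dict is assumed here."""
        return dict(self._attributes)


class ExampleSeeds(BaseSite):
    """Made-up company that returns canned data instead of fetching a page."""

    def _search(self):
        return {'title': 'Cherokee Purple Tomato', 'sku': 'EX-101',
                'size': '0.5 g', 'cost': '3.25', 'organic': True}

    def _parse(self, page):
        return {'name': page['title'], 'number': page['sku'],
                'weight': page['size'], 'price': page['cost'],
                'organic': page['organic']}


product = ExampleSeeds('Cherokee Purple Tomato', 'Tomatoes', True)
attributes = product.get_company_attributes()
```

In this sketch the returned keys mirror settings.ATTRIBUTE_HEADER_ORDER, which would let the output code look up each value by attribute name.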

base Module

botanical_interests Module

fedco_seeds Module