Details on the web scraping scripts I’ve developed.
Purpose: This script scrapes product data from Jumia.com.ng to collect pricing information, product descriptions, and images for market research and comparison purposes.
Data Extracted: Product names, prices, product descriptions, images, ratings, and reviews.
Technologies Used: Cheerio, Node.js.
GitHub Repository: Jumia Scraper
I am currently building a SaaS app that lets users configure the scraping parameters and download the results in JSON or CSV format.
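As a rough illustration of the JSON/CSV download step, here is a minimal sketch in plain Node.js with no dependencies. The sample records, the field names, and the helper names (csvEscape, toCsv, exportResults) are assumptions made for this example and are not taken from the actual Jumia scraper or the SaaS app.

```javascript
// Hypothetical records; the field names are illustrative, not the scraper's real schema.
const products = [
  { name: "Wireless Mouse", price: 4500, rating: 4.2 },
  { name: 'Monitor, 24" LED', price: 85000, rating: null },
];

// Quote a value for CSV: wrap it in double quotes when it contains a comma,
// a double quote, or a line break, and double any embedded quotes.
function csvEscape(value) {
  const text = value === null || value === undefined ? "" : String(value);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Convert an array of flat objects to CSV using the given column order.
function toCsv(rows, columns) {
  const header = columns.map(csvEscape).join(",");
  const lines = rows.map((row) =>
    columns.map((column) => csvEscape(row[column])).join(",")
  );
  return [header, ...lines].join("\r\n");
}

// Serialize the results in the requested format ("json" or "csv").
function exportResults(rows, format, columns) {
  if (format === "json") return JSON.stringify(rows, null, 2);
  if (format === "csv") return toCsv(rows, columns);
  throw new Error(`Unsupported format: ${format}`);
}

console.log(exportResults(products, "csv", ["name", "price", "rating"]));
```

Passing the column list explicitly keeps the CSV header stable even when some scraped records are missing fields; missing or null values are written as empty cells.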
Purpose: This script scrapes product listings from Konga.com, primarily focusing on gathering pricing data and availability to monitor trends in the Nigerian e-commerce space.
Data Extracted: Product names, prices, stock availability, descriptions, and images.
Technologies Used: Node.js.
GitHub Repository: Konga Scraper
Purpose: This script scrapes firm information from the DFSA public register to gather details for regulatory compliance and industry analysis.
Data Extracted: Firm names, addresses, registration details, status, and regulatory info.
Technologies Used: Cheerio, Node.js.
GitHub Repository: DFSA Scraper
Purpose: This script collects property listings from realestate.com.au for analysis of property market trends, comparing location and pricing data.
Data Extracted: Property details, prices, locations, sizes, and agent information.
Technologies Used: Playwright, Node.js.
GitHub Repository: Realestate.com.au Scraper
Purpose: This script collects real estate property listings from Crexi for analysis, focusing on price trends and property details across different locations.
Data Extracted: Property details, pricing, location, size, and agent information.
Technologies Used: Puppeteer, Cheerio, Node.js.
GitHub Repository: Crexi Scraper
Purpose: This script scrapes doctor information from the CPSO website for use in healthcare industry analysis and research.
Data Extracted: Profile URL, Last Name, First Name, CPSO Number, Member Status, CPSO Registration Class, Independent Practice Dates, Former Name, Gender, Languages Spoken, Education, Graduation Year, Specialties, Postgraduate Training, Practice Address, City, Postal Code, Phone, Fax, and Date of Scrape.
Technologies Used: Cheerio, Node.js.
GitHub Repository: CPSO Scraper
Purpose: This script scrapes company information from the Bexio fiduciary directory website.
Data Extracted: Company, Status, Address, Phone Number, Website, and Description.
Technologies Used: Cheerio, Node.js.
GitHub Repository: Bexio Fiduciary Directory Scraper
Purpose: This automated script monitors Upwork job postings in real time, focusing on jobs related to web scraping and data extraction. New job listings are sent immediately to a dedicated Telegram channel, providing live notifications of potential job opportunities.
Data Extracted: Job titles, job descriptions, budgets, number of proposals.
Technologies Used: Node.js, Telegram Bot API.
Telegram Channel: Follow live job notifications on the Web Scraping Jobs and Freelance Writing Jobs Telegram channels.
This project automates the job search process, enabling freelancers and job seekers to stay updated with the latest opportunities without the need for manual searches.
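The notification flow described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the job object shape (id, title, budget, proposals, description) and the helper names are assumptions, and fetching the Upwork listings themselves is left out. The only external call is the Telegram Bot API sendMessage method, which accepts a chat_id and a text field.

```javascript
// Return only the jobs whose IDs are not yet in the `seen` set.
function selectNewJobs(jobs, seen) {
  return jobs.filter((job) => !seen.has(job.id));
}

// Build a plain-text notification from the extracted fields.
// The job object shape is hypothetical.
function formatJob(job) {
  return [
    job.title,
    `Budget: ${job.budget}`,
    `Proposals: ${job.proposals}`,
    "",
    job.description,
  ].join("\n");
}

// Build the HTTP request for the Telegram Bot API sendMessage method.
function buildSendMessageRequest(botToken, chatId, text) {
  return {
    url: `https://api.telegram.org/bot${botToken}/sendMessage`,
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ chat_id: chatId, text }),
    },
  };
}

// Send each unseen job to the channel, marking it as seen only after
// Telegram accepts the message. Uses the global fetch in Node.js 18+.
async function notifyNewJobs(jobs, seen, botToken, chatId) {
  for (const job of selectNewJobs(jobs, seen)) {
    const { url, options } = buildSendMessageRequest(botToken, chatId, formatJob(job));
    const response = await fetch(url, options);
    if (!response.ok) {
      throw new Error(`Telegram API returned ${response.status}`);
    }
    seen.add(job.id);
  }
}
```

A scheduler could call notifyNewJobs(latestJobs, seen, token, "@channel_name") on each polling cycle. In this sketch the seen set lives only in memory, so a long-running monitor would likely need to persist it between restarts to avoid duplicate notifications.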