Web Scraping Projects

Details on the web scraping scripts I’ve developed.

Scraping Projects

Jumia.com.ng Scraper

Purpose: This script scrapes product data from Jumia.com.ng to collect pricing information, product descriptions, and images for market research and comparison purposes.

Data Extracted: Product names, prices, product descriptions, images, ratings, and reviews.

Technologies Used: Cheerio, Node.js.

GitHub Repository: Jumia Scraper

I'm currently working on a SaaS app for controlling the scraping parameters; results will be downloadable in JSON or CSV format.
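
A minimal sketch of what a JSON or CSV export step could look like, using only plain Node.js. The function names, the record shape, and the sample rows are assumptions made up for illustration; they are not taken from the scraper or the app.

```javascript
// Hypothetical helpers: names and record shape are assumptions for illustration.

// Quote a CSV field when it contains a comma, a double quote, or a line break.
function escapeCsvField(value) {
  const text = value === null || value === undefined ? "" : String(value);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Convert an array of flat objects to CSV using an explicit column order.
function toCsv(records, columns) {
  const header = columns.map(escapeCsvField).join(",");
  const rows = records.map((record) =>
    columns.map((column) => escapeCsvField(record[column])).join(",")
  );
  return [header, ...rows].join("\r\n");
}

// Serialize scraped records in the requested format ("json" or "csv").
function exportResults(records, format, columns) {
  if (format === "json") return JSON.stringify(records, null, 2);
  if (format === "csv") return toCsv(records, columns);
  throw new Error(`Unsupported format: ${format}`);
}

// Made-up sample rows, only to exercise the quoting rules.
const sample = [
  { name: 'Phone 6.5" Display, 64GB', price: 85000 },
  { name: "USB Cable", price: 1500 },
];

console.log(exportResults(sample, "csv", ["name", "price"]));
```

Passing the column list explicitly keeps the CSV header stable even when some scraped records are missing a field; missing values are written as empty cells.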

Konga.com Scraper

Purpose: This script scrapes product listings on Konga.com, primarily focusing on gathering pricing data and availability to monitor trends in the Nigerian e-commerce space.

Data Extracted: Product names, prices, stock availability, descriptions, and images.

Technologies Used: Node.js.

GitHub Repository: Konga Scraper
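
Monitoring price and availability trends comes down to comparing one scrape with the next. The sketch below shows one possible way to do that in plain Node.js. The `id`, `price`, and `inStock` fields and the sample values are hypothetical, not the scraper's actual output format.

```javascript
// Hypothetical record shape: { id, price, inStock }. Not the scraper's real schema.

// Compare two scrape snapshots and list new products, price changes, and stock changes.
// Products that disappeared from the current snapshot are not reported in this sketch.
function diffSnapshots(previous, current) {
  const previousById = new Map(previous.map((product) => [product.id, product]));
  const changes = [];

  for (const product of current) {
    const before = previousById.get(product.id);
    if (!before) {
      changes.push({ id: product.id, type: "new" });
      continue;
    }
    if (before.price !== product.price) {
      changes.push({ id: product.id, type: "price", from: before.price, to: product.price });
    }
    if (before.inStock !== product.inStock) {
      changes.push({ id: product.id, type: "stock", from: before.inStock, to: product.inStock });
    }
  }
  return changes;
}

// Made-up snapshots for illustration.
const yesterday = [
  { id: "a", price: 1000, inStock: true },
  { id: "b", price: 2500, inStock: true },
];
const today = [
  { id: "a", price: 900, inStock: true },
  { id: "b", price: 2500, inStock: false },
  { id: "c", price: 400, inStock: true },
];

console.log(diffSnapshots(yesterday, today));
```

Keying the previous snapshot by `id` in a `Map` keeps the comparison linear in the number of products.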

DFSA Public Register Scraper

Purpose: This script scrapes firm information from the DFSA public register to gather details for regulatory compliance and industry analysis.

Data Extracted: Firm names, addresses, registration details, status, and regulatory info.

Technologies Used: Cheerio, Node.js.

GitHub Repository: DFSA Scraper

Realestate.com.au Scraper

Purpose: This script collects property listings from realestate.com.au for analysis of property market trends, comparing location and pricing data.

Data Extracted: Property details, prices, locations, sizes, and agent information.

Technologies Used: Playwright, Node.js.

GitHub Repository: Realestate.com.au Scraper

Crexi Real Estate Scraper

Purpose: This script collects real estate listings from Crexi for analysis, focusing on price trends and property details across different locations.

Data Extracted: Property details, pricing, location, size, and agent information.

Technologies Used: Puppeteer, Cheerio, Node.js.

GitHub Repository: Crexi Scraper

CPSO Doctor Information Scraper

Purpose: This script scrapes doctor information from the CPSO website for use in healthcare industry analysis and research.

Data Extracted: Profile URL, Last Name, First Name, CPSO Number, Member Status, CPSO Registration Class, Independent Practice Dates, Former Name, Gender, Languages Spoken, Education, Graduation Year, Specialties, Postgraduate Training, Practice Address, City, Postal Code, Phone, Fax, and Date of Scrape.

Technologies Used: Cheerio, Node.js.

GitHub Repository: CPSO Scraper

Bexio Fiduciary Directory Scraper

Purpose: This script scrapes company information from the Bexio fiduciary directory website.

Data Extracted: Company, Status, Address, Phone Number, Website, and Description.

Technologies Used: Cheerio, Node.js.

GitHub Repository: Bexio Fiduciary Directory Scraper

Upwork Job Scraper with Telegram Notifications

Purpose: This automated script monitors Upwork job postings in real time, focusing on jobs related to web scraping and data extraction. New job listings are sent immediately to a dedicated Telegram channel, providing live notifications of potential job opportunities.

Data Extracted: Job titles, job descriptions, budgets, and number of proposals.

Technologies Used: Node.js, Telegram Bot API.

Telegram Channel: Follow live job notifications on the Web Scraping Jobs and Freelance Writing Jobs Telegram channels.

This project automates the job search process, enabling freelancers and job seekers to stay updated with the latest opportunities without the need for manual searches.
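
The notification flow described above can be sketched roughly as follows: keep track of which jobs have already been seen, then post each new one through the Telegram Bot API `sendMessage` method. This is an illustrative sketch, not the project's code. The job fields (`id`, `title`, `budget`, `proposals`, `description`), the function names, and the placeholder token and channel values are all assumptions. It relies on the global `fetch` available in Node.js 18 and later.

```javascript
// Hypothetical job shape: { id, title, budget, proposals, description }.

// Return only jobs that have not been seen yet, and remember them for next time.
function takeNewJobs(jobs, seenIds) {
  const fresh = jobs.filter((job) => !seenIds.has(job.id));
  for (const job of fresh) seenIds.add(job.id);
  return fresh;
}

// Turn one job into a plain-text message body.
function formatJobMessage(job) {
  return [
    job.title,
    `Budget: ${job.budget}`,
    `Proposals: ${job.proposals}`,
    "",
    job.description,
  ].join("\n");
}

// Build the HTTP request for the Bot API sendMessage method.
function buildSendMessageRequest(botToken, chatId, text) {
  return {
    url: `https://api.telegram.org/bot${botToken}/sendMessage`,
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ chat_id: chatId, text }),
    },
  };
}

// Send one notification per new job. Not called here because it needs a real token.
async function notifyNewJobs(jobs, seenIds, botToken, chatId) {
  for (const job of takeNewJobs(jobs, seenIds)) {
    const { url, options } = buildSendMessageRequest(botToken, chatId, formatJobMessage(job));
    const response = await fetch(url, options);
    if (!response.ok) {
      throw new Error(`Telegram API returned HTTP ${response.status}`);
    }
  }
}
```

A polling loop would call `notifyNewJobs` after each scrape with the same `Set` of seen IDs, so that a job is only announced once. In a long-running process that set would need to be persisted or trimmed.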
