Web Scraping Application

This project was developed to automate reference search. A Python script opens and navigates websites, takes screenshots of them, and saves the images in dedicated folders. To make the script simpler to use, I rapidly prototyped a desktop app with Tauri. Initially, the project was designed to keep the Python script hidden, which requires users to import the source code.

Code — Python Script

Code — Desktop App

Tech Stack

Tauri

React

Python

Overview

Screenshot Automation

The core of the application uses a headless Chromium browser, controlled from Python through Selenium, to automatically capture high-quality scrolling screenshots of webpages. This lets the app extract visual references consistently and without user input, which is crucial for scaling the reference collection process for designers.
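As a rough illustration of the capture step, the sketch below shows one common way to take a full-height screenshot with Selenium and headless Chrome: load the page, read the document height, resize the window to match, and save a single image. It is not necessarily how the project's script does it. The names screenshot_name and capture_full_page, the 1440 px default width, and the file naming scheme are all assumptions made for this example.

```python
from pathlib import Path
from urllib.parse import urlparse


def screenshot_name(url: str) -> str:
    """Build a file name such as 'example.com__about_team.png' from a URL."""
    parsed = urlparse(url)
    slug = parsed.path.strip("/").replace("/", "_") or "index"
    return f"{parsed.netloc}__{slug}.png"


def capture_full_page(url: str, out_dir: Path, width: int = 1440) -> Path:
    """Save a full-height screenshot of `url` and return the image path."""
    # Imported here so the naming helper can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    options.add_argument("--headless=new")  # run Chrome without a visible window
    driver = webdriver.Chrome(options=options)
    try:
        driver.set_window_size(width, 1000)
        driver.get(url)
        # Resize the window to the document height so one capture covers the page.
        height = driver.execute_script(
            "return document.documentElement.scrollHeight"
        )
        driver.set_window_size(width, height)
        out_dir.mkdir(parents=True, exist_ok=True)
        target = out_dir / screenshot_name(url)
        driver.save_screenshot(str(target))
        return target
    finally:
        driver.quit()  # always release the browser, even if the capture fails
```

Calling capture_full_page("https://example.com/about", Path("screenshots")) would write the image into a screenshots folder. Pages that lazy-load content on scroll may need an extra scrolling pass before the height is measured.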

Page Classification System

To help users navigate large collections of screenshots, I implemented a rule-based page categorization module. It classifies pages (e.g. Home, Contact, Project List) based on URL patterns and cues in the HTML content, and it can learn from new paths to improve accuracy over time.
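A minimal sketch of how such a rule-based classifier could look is given below. The specific patterns, the "Other" fallback, and the names URL_RULES, HTML_RULES, learned_paths, classify and learn are hypothetical and chosen only for illustration; the actual module may use different rules and a different way of remembering new paths.

```python
import re
from urllib.parse import urlparse

# Hypothetical URL rules: category -> pattern matched against the URL path.
URL_RULES = {
    "Contact": re.compile(r"/(contact|get-in-touch)\b", re.I),
    "Project List": re.compile(r"/(projects|work|portfolio)/?$", re.I),
}

# Hypothetical HTML cues, used only when no URL rule matches.
HTML_RULES = {
    "Contact": re.compile(r"mailto:|<form[^>]*>", re.I),
}

# Paths confirmed by the user; these take priority over the static rules.
learned_paths: dict[str, str] = {}


def _path(url: str) -> str:
    """Return the URL path without a trailing slash ('/' for the site root)."""
    return urlparse(url).path.rstrip("/") or "/"


def classify(url: str, html: str = "") -> str:
    """Return a page category using learned paths, URL rules, then HTML cues."""
    path = _path(url)
    if path in learned_paths:
        return learned_paths[path]
    if path == "/":
        return "Home"
    for category, pattern in URL_RULES.items():
        if pattern.search(path):
            return category
    for category, pattern in HTML_RULES.items():
        if pattern.search(html):
            return category
    return "Other"


def learn(url: str, category: str) -> None:
    """Remember the category of a path so later lookups return it directly."""
    learned_paths[_path(url)] = category
```

In this sketch, "learning" is just a lookup table of confirmed paths that is checked before the static rules, which keeps the behaviour predictable and easy to inspect.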

Image Sorting and Organization

After screenshots are taken, the app automatically groups them by domain and page category. Users can browse the results in a structured folder system or through a simple interface. This makes it easy to build moodboards and design reference libraries from dozens of sites in just minutes.
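The grouping step can be sketched as a small function that moves each image into a domain/category folder. The folder layout, the lowercased category folder names, and the names destination and organize are assumptions for this example, not the app's confirmed structure.

```python
import shutil
from pathlib import Path
from urllib.parse import urlparse


def destination(root: Path, url: str, category: str, filename: str) -> Path:
    """Return root/<domain>/<category>/<filename> for one screenshot."""
    domain = urlparse(url).netloc.removeprefix("www.")
    folder = category.lower().replace(" ", "-")
    return root / domain / folder / filename


def organize(shots: list[tuple[Path, str, str]], root: Path) -> list[Path]:
    """Move (image, url, category) entries into the folder tree under root."""
    moved = []
    for image, url, category in shots:
        target = destination(root, url, category, image.name)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(image), str(target))
        moved.append(target)
    return moved
```

With this layout, a screenshot of https://www.example.com/work classified as "Project List" would end up under example.com/project-list inside the chosen root folder.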
