# eduMail Scraper

This repository contains Python tools I used to scrape school contact directories for students, alumni, staff, and professors. It also includes a fully anonymized version of the dataset (~112,000 contacts) that's safe to share, with all personally identifiable information (PII) like names, emails, phone numbers, and profile pictures removed.

![Preview of the anonymized school contacts dataset](docs/assets/img/preview.png)

## What's Inside

- **Python scripts** for scraping and processing contact data
- **Anonymized dataset (`out.csv`)**

## Dataset Columns

| Column Name | Description |
|-------------|-------------|
| Name | Full name |
| Email Address | School email |
| Chat Address | Outlook/Teams chat handle (same as email address) |
| Mobile | Mobile phone number (formats may vary, such as xxx-xxx-xxxx, (xxx) xxx-xxxx, or xxxxxxxxxx) |
| Work Phone | Office or work phone number |
| Job Title | The person's role, such as "Professor," "Student," or "Administrator" |
| Department | The department, program, or field the person belongs to, like "Department of Computer Science" |
| Office Location | Office or building location, like LIB 101 |
| Company | Name of the organization, school, or employer |
| Profile Picture | Profile photo or avatar in base64 |
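
Because the Mobile column can hold the same number in several formats, it may help to normalize values before comparing or deduplicating them. The snippet below is a minimal sketch, not part of the repository's scripts: the function name `normalize_mobile` is hypothetical, and it assumes 10-digit numbers in the three formats listed above.

```python
import re
from typing import Optional


def normalize_mobile(raw: str) -> Optional[str]:
    """Reduce a phone number to its 10 digits (hypothetical helper).

    Assumes the formats listed for the Mobile column:
    xxx-xxx-xxxx, (xxx) xxx-xxxx, or xxxxxxxxxx.
    Returns None when the input does not contain exactly 10 digits.
    """
    digits = re.sub(r"\D", "", raw)  # drop everything that is not a digit
    return digits if len(digits) == 10 else None


# Placeholder numbers for illustration only, not taken from the dataset.
samples = ["555-010-1234", "(555) 010-1234", "5550101234", "n/a"]
normalized = [normalize_mobile(s) for s in samples]
```

Returning `None` for anything that is not exactly 10 digits keeps malformed entries visible, so they can be reviewed instead of being silently coerced.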