2026-03-26 23:15:56 +09:00

eduMail Scraper

This repository contains the Python tools I used to scrape school contact directories covering students, alumni, staff, and professors. It also includes an anonymized version of the resulting dataset (~112,000 contacts) that is safe to share: all personally identifiable information (PII), such as names, emails, phone numbers, and profile pictures, has been removed.

Preview of the anonymized school contacts dataset

What's Inside

  • Python scripts for scraping and processing contact data
  • Anonymized dataset (out.csv)
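
As a quick way to explore the dataset, here is a minimal sketch that counts rows per value of one column. It assumes that out.csv is UTF-8 encoded and that its header row uses the column names listed under Dataset Columns. The helper name `count_by_column` is hypothetical and is not one of the scripts in this repository.

```python
import csv
from collections import Counter


def count_by_column(csv_file, column="Department"):
    """Count rows per distinct value of `column` in an open CSV file.

    Assumption: the first row of the file is a header containing `column`.
    """
    reader = csv.DictReader(csv_file)
    counts = Counter()
    for row in reader:
        # Missing or blank cells are grouped under "(blank)"; anonymized
        # columns may well be empty.
        value = (row.get(column) or "").strip() or "(blank)"
        counts[value] += 1
    return counts


# Usage sketch (assumes out.csv is in the current directory):
#
#     with open("out.csv", newline="", encoding="utf-8") as f:
#         for value, n in count_by_column(f, "Job Title").most_common(10):
#             print(n, value)
```

Passing an open file object rather than a path keeps the function easy to try on a small in-memory sample before running it on the full file.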

Dataset Columns

| Column Name | Description |
| --- | --- |
| Name | Full name |
| Email Address | School email |
| Chat Address | Outlook/Teams chat handle (same as email address) |
| Mobile | Mobile phone number (formats may vary, such as xxx-xxx-xxxx, (xxx) xxx-xxxx, or xxxxxxxxxx) |
| Work Phone | Office or work phone number |
| Job Title | The person's role, such as "Professor," "Student," or "Administrator" |
| Department | The department, program, or field the person belongs to, like "Department of Computer Science" |
| Office Location | Office or building location, like LIB 101 |
| Company | Name of the organization, school, or employer |
| Profile Picture | Profile photo or avatar in base64 |
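
Because the Mobile column can arrive in several formats, a normalization step is handy when processing raw scraped data. The sketch below shows one possible approach, assuming every valid number has exactly 10 digits as in the three formats listed above. The function name `normalize_mobile` is hypothetical and not taken from the repository's scripts. Phone numbers are removed from the anonymized out.csv, so this applies only to unredacted input.

```python
import re


def normalize_mobile(raw):
    """Return a phone number as xxx-xxx-xxxx, or None if it is not 10 digits.

    Assumption: valid numbers contain exactly 10 digits once punctuation
    and spaces are stripped; anything else is treated as unparseable.
    """
    digits = re.sub(r"\D", "", raw or "")  # drop everything except digits
    if len(digits) != 10:
        return None
    return f"{digits[:3]}-{digits[3:6]}-{digits[6:]}"


# Usage sketch:
#     normalize_mobile("(555) 123-4567")  # -> "555-123-4567"
#     normalize_mobile("5551234567")      # -> "555-123-4567"
#     normalize_mobile("12345")           # -> None
```

Returning None for anything that is not 10 digits keeps malformed entries visible for manual review.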