Jay 2026-03-24 23:46:08 -07:00
parent 1ff1207619
commit 5b85b9cfb2
6 changed files with 42 additions and 0 deletions

README.md Normal file

@@ -0,0 +1,42 @@
# Word Analyzer
Extracts text from PDFs using Apache PDFBox and analyzes word frequency with customizable filters.
<p align="center">
<img src="docs/assets/img/preview.jpg" width="75%" alt="Screenshot of program"/>
</p>
## Features
- Scans all PDF files in a given folder
- Counts and displays word frequency
- Filters results by minimum and maximum frequency
- Optional maximum file count limit
- Shows scan logs and results in separate windows
- Displays total scan time
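
The counting and filtering features above can be sketched in plain Java. This is a minimal illustration, not the code shipped in `WordAnalyzer.jar`: the class name `WordFrequencySketch`, the method names, the tokenization rule (lower-cased runs of letters), and the inclusive minimum/maximum bounds are all assumptions made for the example.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper; the real WordAnalyzer may tokenize and filter differently.
final class WordFrequencySketch {

    // Counts how often each word occurs in the text.
    // Assumption: a "word" is a lower-cased run of Unicode letters.
    static Map<String, Integer> countWords(String text) {
        Map<String, Integer> counts = new TreeMap<>(); // sorted alphabetically by word
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}]+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Keeps only the words whose count lies in [min, max].
    // Assumption: both bounds are inclusive.
    static Map<String, Integer> filterByFrequency(Map<String, Integer> counts, int min, int max) {
        Map<String, Integer> kept = new LinkedHashMap<>(); // preserves the input order
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            int count = entry.getValue();
            if (count >= min && count <= max) {
                kept.put(entry.getKey(), count);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = countWords("The cat saw the dog. The dog ran.");
        System.out.println(filterByFrequency(counts, 2, 3)); // prints {dog=2, the=3}
    }
}
```

In the actual program the input text would come from PDFBox rather than a string literal; the sketch leaves that step out so it runs with the JDK alone.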
## Requirements
- Java 8 or later (a JRE is enough to run the program; a JDK is needed to build from source)
- Apache PDFBox 1.8.16 (bundled, no download needed)
## Usage
```bash
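# Linux/macOS: classpath entries are separated with ':'
java -cp WordAnalyzer.jar:pdfbox-1_8_16.jar wordanalyzer.WordAnalyzer
# Windows: classpath entries are separated with ';'
java -cp "WordAnalyzer.jar;pdfbox-1_8_16.jar" wordanalyzer.WordAnalyzer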
```
1. Enter the folder path containing your PDF files
2. Set the minimum and maximum frequency filters
3. Optionally set a maximum file count
4. Click **Confirm Folder Path** to start the scan
5. Results appear in the results window once the scan completes
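
Steps 1 and 3 amount to collecting the PDF files in a folder, optionally capped at a maximum count. The sketch below shows one way to do that with the standard library. The class `PdfFolderSketch`, the method `listPdfs`, the non-recursive scan, the name-sorted order, and the convention that a limit of `0` or less means "no limit" are assumptions for illustration and may not match how the program behaves.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper; not taken from the WordAnalyzer source.
final class PdfFolderSketch {

    // Returns the PDF files directly inside `folder`, sorted by path.
    // Assumption: maxFiles <= 0 means "no limit"; subfolders are not scanned.
    static List<Path> listPdfs(Path folder, int maxFiles) throws IOException {
        try (Stream<Path> entries = Files.list(folder)) {
            Stream<Path> pdfs = entries
                    .filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString()
                            .toLowerCase(Locale.ROOT).endsWith(".pdf"))
                    .sorted();
            if (maxFiles > 0) {
                pdfs = pdfs.limit(maxFiles);
            }
            return pdfs.collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        long start = System.nanoTime();
        Path folder = Paths.get(args.length > 0 ? args[0] : ".");
        List<Path> pdfs = listPdfs(folder, 10);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(pdfs.size() + " PDF file(s) found in " + elapsedMs + " ms");
    }
}
```

Each returned path could then be handed to PDFBox for text extraction before the word counts are computed.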
## Building from Source
```bash
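# Compile against the bundled PDFBox jar; class files go to out/
mkdir -p out
javac -cp pdfbox-1_8_16.jar -d out src/wordanalyzer/WordAnalyzer.java
# Package the classes with wordanalyzer.WordAnalyzer as the entry point
jar cfe WordAnalyzer.jar wordanalyzer.WordAnalyzer -C out .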
```

BIN
WordAnalyzer.jar Normal file

Binary file not shown.

BIN
docs/assets/img/preview.jpg Normal file

Binary file not shown.
