Add Python web scraper for NJC travel rates with currency extraction

- Implemented Python scraper using BeautifulSoup and pandas to automatically collect travel rates from official NJC website
- Added currency extraction from table titles (supports EUR, USD, AUD, CAD, ARS, etc.)
- Added country extraction from table titles for international rates
- Flatten pandas MultiIndex columns for cleaner data structure
- Default to CAD for domestic Canadian sources (accommodations and domestic tables)
- Created SQLite database schema (raw_tables, rate_entries, exchange_rates, accommodations)
- Successfully scraped 92 tables with 17,205 rate entries covering 25 international cities
- Added migration script to convert scraped data to Node.js database format
- Updated .gitignore for Python files (.venv/, __pycache__, *.pyc, *.sqlite3)
- Fixed city validation and currency conversion in main app
- Added comprehensive debug and verification scripts

This replaces manual JSON maintenance with automated data collection from official government source.
This commit is contained in:
2026-01-13 09:21:43 -05:00
commit 15094ac94b
84 changed files with 19859 additions and 0 deletions

View File

@@ -0,0 +1,248 @@
# Database Implementation Summary
## ✅ Completed Changes
The Government Travel Cost Estimator has been upgraded to use JSON databases for easy rate management.
## 📊 New Database Files
### 1. `data/perDiemRates.json`
**Purpose:** Stores meal allowances and incidental expense rates
**Contains:**
- Meal rates (breakfast, lunch, dinner) for all regions
- 100%, 75%, and 50% rate tiers for extended stays
- Incidental expense allowances
- Private accommodation allowances
- Metadata (effective date, version, last update)
**Regions Included:**
- Canada (provinces)
- Yukon
- Northwest Territories
- Nunavut
- Continental USA
- Alaska
- International (average rates)
**Size:** ~4KB
**Format:** JSON
**Last Updated:** October 30, 2025
**Effective Date:** October 1, 2025
---
### 2. `data/accommodationRates.json`
**Purpose:** Stores hotel and accommodation rates by city
**Contains:**
- Standard and maximum rates for 30+ major cities
- Regional default rates
- International city rates
- Currency information
- Government rate guidelines
**Cities Included:**
- **Canada:** Ottawa, Toronto, Montreal, Vancouver, Calgary, Edmonton, Winnipeg, Halifax, Quebec City, Victoria, Whitehorse, Yellowknife, Iqaluit
- **USA:** New York, Washington DC, Chicago, Los Angeles, San Francisco, Seattle, Boston, Anchorage
- **International:** London, Paris, Tokyo, Beijing, Sydney, Dubai, Brussels, Geneva
**Size:** ~6KB
**Format:** JSON
**Last Updated:** October 30, 2025
---
## 🔄 Code Changes
### Modified: `script.js`
**Changes:**
1. Removed hardcoded rate constants
2. Added database loading functions (`loadDatabases()`)
3. Added rate lookup functions (`getAllowancesForRegion()`, `getAccommodationSuggestion()`)
4. Added automatic accommodation rate suggestions
5. Updated footer to show database effective dates
6. Added error handling for database loading
**New Functions:**
- `loadDatabases()` - Async function to load both JSON files
- `getAllowancesForRegion(destinationType)` - Gets per diem rates from database
- `getAccommodationSuggestion(city, type)` - Gets suggested accommodation rates
- `updateMetadataDisplay()` - Updates footer with database dates
- `handleDestinationInput()` - Provides accommodation suggestions as user types
### Modified: `index.html`
**Changes:**
1. Added accommodation suggestion placeholder
2. Updated footer to display dynamic database dates
---
## 📚 Documentation Created
### 1. `DATABASE_UPDATE_GUIDE.md` (Detailed, ~250 lines)
Complete step-by-step instructions for updating rates including:
- When to update databases
- How to update per diem rates
- How to update accommodation rates
- JSON validation procedures
- Testing procedures
- Version control guidelines
- International rate handling
- Data integrity checklist
### 2. `DATABASE_SCHEMA.md` (Technical Reference, ~200 lines)
Technical documentation including:
- Complete JSON structure specifications
- Field definitions and data types
- Validation rules
- Lookup logic
- Example entries
- Update frequency guidelines
### 3. Updated `README.md`
Added sections:
- Database structure overview
- File listing with database files
- Link to update guide
---
## 🎯 Benefits of Database Approach
### ✅ Advantages:
1. **Easy Updates** - No code changes needed for rate updates
2. **Maintainability** - Rates separate from application logic
3. **Version Control** - Track rate changes with metadata
4. **Scalability** - Easy to add new cities and regions
5. **Flexibility** - Support for extended stay rates (future feature)
6. **Transparency** - Rates visible in human-readable JSON
7. **Validation** - Can validate rates before deployment
8. **Documentation** - Metadata embedded in database files
### 📈 Features Enabled:
- Automatic accommodation suggestions based on destination city
- Dynamic footer showing rate effective dates
- Foundation for extended stay rate reductions
- Easy addition of new cities/regions
- Currency handling per region
- Rate tier support (100%, 75%, 50%)
---
## 🔧 Maintenance Workflow
### Regular Updates (Annual)
1. Download new NJC rates (typically October 1st)
2. Open `data/perDiemRates.json`
3. Update rates and metadata
4. Validate JSON syntax
5. Test application
6. Deploy
### Ad-hoc Updates
- Add new cities to `accommodationRates.json`
- Adjust accommodation rates as needed
- Update international rates for currency changes
### Time Required
- Per diem update: ~15-30 minutes
- Accommodation update: ~5-10 minutes per city
- Testing: ~10 minutes
---
## 📦 File Structure
```
Govt Travel App/
├── index.html
├── styles.css
├── script.js
├── README.md
├── Govt Links.txt
├── DATABASE_UPDATE_GUIDE.md ← NEW
├── DATABASE_SCHEMA.md ← NEW
├── DATABASE_SUMMARY.md ← NEW (this file)
└── data/ ← NEW
├── perDiemRates.json ← NEW
└── accommodationRates.json ← NEW
```
---
## 🧪 Testing Checklist
After database updates, test:
- [ ] Application loads without errors
- [ ] Per diem rates calculate correctly for each region
- [ ] Accommodation suggestions appear for known cities
- [ ] Footer displays correct effective date
- [ ] Calculator produces accurate totals
- [ ] All destination types work
- [ ] Console shows no errors (F12)
---
## 🔒 Data Integrity
### Validation Performed:
✅ JSON syntax validated
✅ All required fields present
✅ Rate calculations verified (75% = 0.75 × 100%)
✅ Currency codes consistent
✅ Region keys match application options
✅ City keys properly formatted
✅ Metadata complete and current
---
## 📞 Support Resources
**For Rate Updates:**
- [NJC Appendix C](https://www.njc-cnm.gc.ca/directive/travel-voyage/td-dv-a3-eng.php)
- [NJC Appendix D](https://www.njc-cnm.gc.ca/directive/app_d.php?lang=en)
- [Accommodation Directory](https://rehelv-acrd.tpsgc-pwgsc.gc.ca/lth-crl-eng.aspx)
**For Technical Help:**
- See `DATABASE_UPDATE_GUIDE.md` for procedures
- See `DATABASE_SCHEMA.md` for structure reference
- Use [JSONLint](https://jsonlint.com/) for validation
---
## 🚀 Future Enhancements
Database structure is ready for:
- [ ] Extended stay rate reductions (31+, 121+ days)
- [ ] Multiple traveler support
- [ ] Historical rate comparisons
- [ ] Currency conversion
- [ ] Rate change notifications
- [ ] Automated rate imports
- [ ] API integration for real-time rates
---
## 📊 Database Statistics
**Per Diem Database:**
- 7 regions covered
- 3 meal types × 3 rate tiers = 9 rates per region
- Incidental rates for all regions
- Total: ~60 rate values
**Accommodation Database:**
- 13 Canadian cities
- 8 US cities
- 8 International cities
- 7 Regional defaults
- Total: 36 location entries
**Combined Size:** ~10KB (very lightweight)
---
**Implementation Date:** October 30, 2025
**Database Version:** 1.0
**Status:** ✅ Complete and Tested