Monthly invoices arrive as (text-based) PDF files, and the hours worked (quantity) at each hourly rate (price per unit) have to be entered manually, day by day, in a third-party web application.
When such an invoice covers hours worked over many days at multiple rates, manual data entry becomes very tedious and time-consuming.
An option for automation...
The starting point is a PDF file from which the tabular data needs to be extracted. Both on-premises and cloud solutions exist that try to extract that tabular data (date, price, quantity) from a PDF file. One option is the open-source tabula-java tool, which can be wrapped in a REST API call that returns the tabular data as JSON. PDF in, JSON out. No promises, but this deserves a post of its own.
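To make the "JSON out" half a bit more concrete, here is a minimal TypeScript sketch of how the returned JSON could be turned into typed invoice lines. The JSON shape (an array of objects with `date`, `price` and `quantity` fields) and the names `InvoiceLine` and `parseInvoiceLines` are assumptions for illustration only; the actual output of the wrapped service will depend on how the wrapper is built.

```typescript
// Hypothetical shape of one extracted invoice line; the field names are assumptions.
interface InvoiceLine {
  date: string;     // day the hours were worked, e.g. "2024-03-01"
  price: number;    // hourly rate (price per unit)
  quantity: number; // hours worked
}

// Parse the JSON text returned by the (assumed) table-extraction service
// and reject rows that lack one of the three expected fields.
function parseInvoiceLines(json: string): InvoiceLine[] {
  const parsed: unknown = JSON.parse(json);
  if (!Array.isArray(parsed)) {
    throw new Error("Expected a JSON array of invoice lines");
  }
  return parsed.map((row: unknown, index: number) => {
    if (typeof row !== "object" || row === null) {
      throw new Error(`Row ${index} is not an object`);
    }
    const { date, price, quantity } = row as Record<string, unknown>;
    if (
      typeof date !== "string" ||
      typeof price !== "number" ||
      typeof quantity !== "number" ||
      !isFinite(price) ||
      !isFinite(quantity)
    ) {
      throw new Error(`Row ${index} is missing a valid date, price or quantity`);
    }
    return { date, price, quantity };
  });
}
```

Validating at this boundary means that a badly extracted table fails loudly here, rather than ending up as wrong hours in the web application.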
But how do you then inject that tabular data into the HTML of the third-party web page? This is where a local Chrome extension comes into play. Such an extension offers the following functionality:
- reads a PDF file
- calls the tabular web service
- shows the returned data, and totals, for validation
- has a button to push the tabular data to the third-party web page
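The validation step in the list above could look roughly like the following TypeScript sketch: it totals the hours and the amount so they can be compared with the totals printed on the invoice, and groups the lines per day, since the web application expects entry by day. The `InvoiceLine` shape and the function names `computeTotals` and `groupByDay` are assumptions for illustration, not part of any existing extension.

```typescript
// Hypothetical shape of one extracted invoice line; the field names are assumptions.
interface InvoiceLine {
  date: string;     // day the hours were worked, e.g. "2024-03-01"
  price: number;    // hourly rate (price per unit)
  quantity: number; // hours worked
}

interface Totals {
  hours: number;  // sum of all quantities
  amount: number; // sum of price * quantity over all lines
}

// Sum hours and amount for comparison with the invoice totals.
// Each line amount is rounded to whole cents before summing,
// which limits floating-point drift in the displayed total.
function computeTotals(lines: InvoiceLine[]): Totals {
  let hours = 0;
  let cents = 0;
  for (const line of lines) {
    hours += line.quantity;
    cents += Math.round(line.price * line.quantity * 100);
  }
  return { hours, amount: cents / 100 };
}

// Group the lines per day, keeping the original order within each day.
function groupByDay(lines: InvoiceLine[]): Record<string, InvoiceLine[]> {
  const byDay: Record<string, InvoiceLine[]> = {};
  for (const line of lines) {
    const bucket = byDay[line.date] ?? [];
    bucket.push(line);
    byDay[line.date] = bucket;
  }
  return byDay;
}
```

For example, three lines of 8 hours at 50, 2 hours at 75 and 4 hours at 50 give 14 hours and an amount of 750. If those figures match the invoice, the push button can be used with some confidence.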