*** Updated file is downloadable here: Download File ***
However, this can be very useful at it basically enables to query a source of data that may be not accessible by a permanent URL or a logically build variable URL (acronym for Uniform Resource Locator by the way). I like to define it as a kind of web-based SQL (acronym for Structured Query Language).
This is not limited to query data (GET) but can be used to send data (POST). Ultimately, it is what we have been doing on our spreadsheet to send Twitter status updates (Tweets) from Excel and VBA.
Looking back at it, it’s quite impressive I managed to deliver this as 64 bits HMAC AHA1 encoding is still a nebulous. I think the next step will be to send pictures attached to the Tweet, but first things first!
Our problem was triggered by this blog post:
The following web page of is giving access to the two latest monthly USDA WASDE reports, with a permanent link and a systematically and logistically dated filename.
All the other previous reports are stored somewhere else, in a historical database:
However, there is an issue, in this database, the links to access the files aren’t built logically and cannot be guessed.
In other words, if you need a particular month, you need to know the part of the path in red, and unfortunately, it cannot be built logically and they aren’t known in advance. So appart from creating a manual list, ad-hoc, if we need to access those files, we’re a bit in a dead end.
And sometimes, we may need indeed access to those files. On the heatmap, it’s taking the 2 last files to basically check changes from one month to another. However, just before publication (a few days?) the latest file on the website is the one to be published (with a broken link until publishing). Therefore, the previuous is actually the current, and the ‘real’ previous has moved to the historical database. But we don’t have the address, therefore, our programm will not work as it should and will go through a error handling loop to zero variation from one month to another (far from being satisfactory).
But there is an API! Good news!
We will go through the steps to reach the following goal: get the historical database link of the previous month publication, then if the link is required, we will have it.
One need to create an account: https://usda.library.cornell.edu/
Back to the API, with this login and password, it’s easy to request an api-token with a POST request to /user_token.
Clicking ‘Tty it Out’ enables to enter login, password and execute the request.
The token required is appearing in section 201 (erased partially for obvious privacy reasons):
This unique string of 97 characters, the API token, is basically the password to access the API. The API owner know where the request is coming from and will come after you if you’re using it against their T&C… And I would not mess with the US Government! Joke aside, it’s why it’s important to never give away those tokens.
The same webpage can be used to play with the API as a test. But befre that, at the top of the page, there is an ‘Autohorize’ button (circled in red), clicking is displaying a box and the API token needs to be entered in the ‘Value’ field and then click (circled in green) ‘Authorize’ button.
We can now play with the API. We want to find a release, with the identifier (WASDE should be good enough) of the release and the date (we know the release date), we should be able to get what we’re looking for… Let’s try with the 10th of March 2020… Into the ‘Release’ section, ‘find release by identifier’. Let’s click ‘Try it out’!
Inputting the aforementioned parameters…
And ‘Execute’… Crossing fingers… Something is happening!
The two inmportant bits are the Request URL and the response. The Request URL is what we will use in VBA, and the response is what we’ll have to use. In the response, there’s the link of what we are looking for, the permanent link of the USDA WASDE for March 2020:
Then, if you can get the link in the full XML response, it will be very easy to extract the full link from the string and use it to open the file if necessary. But let’s see how we can send the request in VBA…
Back refereeing to this blog post:
We basically did the same for Twitter, it is quite complicated and with hexadecimal encoding… Will a Government agency have an easier or more complicated process? The documentation is rather poor as API publisher assume a general API communication knowledge in whatever programming language. They provide the tool and the structure of how to make the request, and user need to adjust it to the environment he’s using.
The Request URL and C-URL (meaning Client URL) gives us a huge clue: the instruction we will have to send to seems to be constructed with the identifier (‘wasde’), the date range, and the token seems to be without encoding. This is a surprise, you’d expect a Government agency to have a very high level of encryption, by principle.
Let’s just try a GET with header ‘Authorization’ ‘Bearer’ and our API Token as suggested by the C-URL…
To my astonishment, it worked. Just simple.
Variable ‘myurl’ is the url that is sent, easily computable as far as the date part is concerned: https://usda.library.cornell.edu/api/v1/release/findByIdentifier/wasde?latest=false&start_date=2020-04-09&end_date=2020-04-09
Variable ‘TokenAPI’ is the token, plain, without encryption.
Easy peasy, right? It should give the basis, thanks to the webpage, to use the API with any of the GET request. POST request are obviously a no-go area as the token won’t allow to post stuff, unless you’re say an employee specifically approved to send data.
So let’s apply this to our usually spreadsheet, the USDA WASDE Heatmap: the latest version was here, it will also be the occasion to make an update to fool proof the code:
In essence, our macro is opening the latest file available. But if the latest file available is actually already classified as previous as the latest is just pending to be published, we can’t get the real previous file link. We used to therefore to zero the move between previous and latest (we will keep that as ultimate error handling though… You never know!). But we can now call a macro that will retrieve the link of the actual previous file to open.
We know the month we need to open, we know the release date, therefore we can build the Request URL pretty easily:
A valid Token must be input in H5 and our response is published in G9 as shown in the macro.
In the response, we identify the character position in the string of “latest.xlsm” (ising the Excel formula FIND in G10, ditch everything that is on the right of it (using LEFT formula in G11) and we’ll do a quick macro to ditch characters on the left.
Starting from the right, we will keep everything until we meet the double quote that was opening the link.
And H14 is the link that can be used in our macro! There’s a bit of error handling… Just in case, for example, the user can forget to input a valid API token and the answer won’t contain the file URL an the string to shorten might actually be blank.
Our macro OldCrp needs to be amended at the start:
I have kept the initial error handling, just in case the API call is not working. But we cancel the flag (‘MissPrev’ variable) if the API has allowed us to open the required file. The API might be down, the date of the report might have changed (like for a US Government shutdown), the user might forgot to input a valid token, etc,… So too much error handling is better that not enough.
The rest of our program is not changing. If the flag shows that the previous file is missing, it zeros the movement from one month to another, else, it uses the file that was opened, regardless if it was from the API or the generic dated website link.
Obviously, with the API, there is now much cleaner way to perform the operation we require. We could query the API the same way from the today’s date back to the first of the month 2 months before (covering 3 calendar months) and use the two latest file. This would be totally fool proof. It requires to change the whole structure. The constant update of this file, created years ago, is more for educational purpose than efficiency, even if it starts to become somehow a spaghetti program! But API capabilities opens to us a world that is not short of possibilities.
I will obviously test the file in the real situation, when the USDA is moving the latest file as the one to be published. Because at the moment, the program is not using the API as there is no need. However, it must be stressed that the program IS using the API in last resort when it needs, so the user in theory MUST create an API Token, else the functionality will be bypassed. This is why the recourse to the API has been left to the ultimate situation where USDA has moved the latest file to being the one soon to be published and error handling left, as some users might not be bothered about going through the API token creation process,
Updated file is downloadable here: Download File.
Also, considering that unless a report publishing date is changed, our program will be fine and I have reactivated the screenshots exports of the tables (worksheet ‘Param’, destination path needs to be input in L5), as well as the Twitter API. Remember, I have not provided my tokens obviously! A good reminder of how to deal with the Twitter API:
The next project will be to publish on Twitter the exported screenshots. Currently, our Twitter API program only supports plain text Tweets but we’ll get over it… Hopefully!