Keboola projects tend to grow over time. Transformations are added, data is adjusted, and unless you have good habits and a strong moral sense, there is rarely time to work out properly whether some data can be deleted. The situation is even worse when you inherit a project from someone else.
You can find a lot of useful information in Keboola Telemetry. However, you cannot easily find out which data is being downloaded but never used (tables that are not used in any transformation or written out by any writer). For this more detailed analysis, I wrote a Python script that not only gives me basic information about the project (all buckets, components, and tables) but can also identify the tables in storage that are not used. Interested in what the result looks like? Check out the sample report.
To make the script easy to use, I have shared it in Google Colab, where you can run it for free. The source code is also available (under the CC-BY license), so you can download the entire script, explore the code, and customize it to your needs. Want to know the details of your own project? Go for it.
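The core idea behind the unused-table check can be sketched as a simple set difference. This is only an illustration and not the code of the shared script: the function name, the table IDs, and the shape of the `config_inputs` dictionary are all hypothetical, and in a real project the lists would have to be collected from the project's storage and its component configurations.

```python
def find_unused_tables(all_tables, config_inputs):
    """Return the storage tables that no configuration reads from.

    all_tables    -- iterable of table IDs present in storage
    config_inputs -- dict mapping a configuration name (transformation or
                     writer) to the list of table IDs it uses as input
    """
    used = set()
    for tables in config_inputs.values():
        used.update(tables)
    # Everything in storage that never appears as an input is a candidate.
    return sorted(set(all_tables) - used)


# Hypothetical sample data, for illustration only.
all_tables = [
    "in.c-crm.customers",
    "in.c-crm.orders",
    "in.c-ads.campaigns",
    "out.c-reporting.revenue",
]
config_inputs = {
    "transformation: revenue": ["in.c-crm.customers", "in.c-crm.orders"],
    "writer: reporting": ["out.c-reporting.revenue"],
}

print(find_unused_tables(all_tables, config_inputs))
```

A table that shows up in the result is only a candidate for deletion; it may still be used outside the project, so it is worth checking before removing anything.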
Help
How to Find the Correct Keboola API Address
The Keboola API address depends on where your project is hosted. You can easily find it in the address bar after logging into your project, because it is the first part of the address. So, if your address is in the following format, the Keboola API address is the part up to and including the .com ending:
https://connection.north-europe.azure.keboola.com/admin/projects/12345/dashboard
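If you prefer not to cut the address by hand, the same thing can be done with Python's standard library. This is a minimal sketch: the function name `api_address` is my own, and it assumes that you want the address including the `https://` prefix (if you need only the host name, use `parts.netloc` on its own).

```python
from urllib.parse import urlsplit


def api_address(project_url):
    """Return the leading part of a project URL (scheme and host)."""
    parts = urlsplit(project_url)
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"Not a full URL: {project_url!r}")
    # Drop the path (/admin/projects/...) and keep only scheme and host.
    return f"{parts.scheme}://{parts.netloc}"


url = "https://connection.north-europe.azure.keboola.com/admin/projects/12345/dashboard"
print(api_address(url))
# prints: https://connection.north-europe.azure.keboola.com
```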
Creating an API Token
You can create an API token in your project in five simple steps:
1. Log into your project, click on your icon in the upper right corner, and select Settings.
2. You are now in the project settings. Choose the API Tokens tab.
3. In the upper right corner, click the + NEW TOKEN button.
4. In the newly opened form, fill in the Description (so you know what the token is for) and switch the Components & Buckets section to Full Access. I recommend setting the token's validity so that it "self-destructs" after a certain period. If anyone gets hold of it later, it will be useless to them. You can set the validity in the Expires In drop-down list. Create the new token by clicking the Create button.
5. Now copy the token or have it sent to your email. If you don't save its value, you will need to update the token, because the original value cannot be displayed again in any way.
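Once you have both the API address and the token, you can quickly check that they work together before running a larger analysis. The sketch below assumes the Storage API token verification endpoint (`/v2/storage/tokens/verify`) and the `X-StorageApi-Token` header; the helper names are my own and are not taken from the shared script. It uses only the standard library, and nothing is sent over the network until you call `verify_token`.

```python
import json
import urllib.request


def build_verify_request(api_address, token):
    """Prepare (but do not send) a GET request that verifies a token."""
    url = api_address.rstrip("/") + "/v2/storage/tokens/verify"
    # The token is passed in a header, never in the URL.
    return urllib.request.Request(url, headers={"X-StorageApi-Token": token})


def verify_token(api_address, token):
    """Send the request and return the token details as a dict.

    urllib raises urllib.error.HTTPError if the API rejects the token.
    """
    request = build_verify_request(api_address, token)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


# Placeholder values; replace them with your own address and token.
request = build_verify_request(
    "https://connection.north-europe.azure.keboola.com", "your-token-here"
)
print(request.full_url)
```

Calling `verify_token(api_address, token)` with real values should return a description of the token if it is valid. Keep the token out of shared notebooks and version control, for example by reading it from an environment variable.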