ChatGPT has been a lifesaver when I had to get some data for my articles stored in a database, but ChatGPT is only effective when trained. This tutorial will walk you through:
- Prompting ChatGPT to help you describe the structure
- Writing a prompt to help you interpret the data
- Handling errors
The final prompt has this structure:
I have a [database type] with a table called [table name]. I want you to help me find information from the database.
Here's the columns, data type, description and examples.
[question]
[paste column description]
Here’s how to build the prompt.
Step 1: Get information about the database
Describing the database’s structure requires you to find out the columns, data types and their definitions.
To create this tutorial, I’m using some flight data from the online database sandbox SQLize so you don’t have to set anything up. Go to the dropdown labelled “SQL Server” and pick “PostgreSQL 10 Bookings (Read Only)”.
In the absence of knowing what’s in the table, I generally like to ask for 100 rows of all columns. The first query will include these bits of information:
- Database type (e.g. MySQL? SQLite? PostgreSQL?)
- Table name (in this case, “flights”)
- Request (“show 100 rows with all columns”)
I have a PostgreSQL 10 database. I know the table is called "flights". Can you show me 100 rows with all columns?

SELECT *
FROM flights
LIMIT 100;
Copy the query and run it in SQLize and you’ll get this output on the right:

Click to see sample data
| flight_id | flight_no | scheduled_departure | scheduled_arrival | departure_airport | arrival_airport | status | aircraft_code | actual_departure | actual_arrival |
|---|---|---|---|---|---|---|---|---|---|
| 1185 | PG0134 | 2017-09-10 06:50:00+00 | 2017-09-10 11:55:00+00 | DME | BTK | Scheduled | 319 | [null] | [null] |
| 3979 | PG0052 | 2017-08-25 11:50:00+00 | 2017-08-25 14:35:00+00 | VKO | HMA | Scheduled | CR2 | [null] | [null] |
| 4739 | PG0561 | 2017-09-05 09:30:00+00 | 2017-09-05 11:15:00+00 | VKO | AER | Scheduled | 763 | [null] | [null] |
| 5502 | PG0529 | 2017-09-12 06:50:00+00 | 2017-09-12 08:20:00+00 | SVO | UFA | Scheduled | 763 | [null] | [null] |
Next, find out the data type of each column. I asked ChatGPT “are you able to tell me the data type for each column?” and it gave me this query:
SELECT
column_name,
data_type,
is_nullable
FROM information_schema.columns
WHERE table_name = 'flights'
ORDER BY ordinal_position;
I ran that and got this:
| column_name | data_type | is_nullable |
|---|---|---|
| flight_id | integer | NO |
| flight_no | character | NO |
| scheduled_departure | timestamp with time zone | NO |
| scheduled_arrival | timestamp with time zone | NO |
| departure_airport | character | NO |
| arrival_airport | character | NO |
| status | character varying | NO |
| aircraft_code | character | NO |
| actual_departure | timestamp with time zone | YES |
| actual_arrival | timestamp with time zone | YES |
Step 2: How to describe columns for ChatGPT to understand your table
Study the output and describe each column to create the heart of your prompt. I use this format:
[column name] : [data_type], [description OR distinct data points] [examples]
So here are the columns, and how I’d describe them:
flight_id : integer, a serial number (example: 1185, 3979, 4739, 5502)
flight_no: character, follows flight number conventions, 2 letters and usually 4 to 5 digits (example: PG0134, PG0052, PG0561)
scheduled_departure: timestamp with time zone (example: 2017-09-05 09:30:00+00)
scheduled_arrival: timestamp with time zone (example: 2017-09-05 11:15:00+00)
departure_airport: character, IATA codes consisting of 3 letters (example: DME, VKO, SVO)
arrival_airport: character, IATA codes consisting of 3 letters (example: BTK, HMA, AER)
aircraft_code: character, usually 3 character (e.g. 319, CR2, 763)
status: *... see section below ...*
actual_departure: timestamp with time zone, may be NULL (example: 2017-09-05 11:15:00+00)
actual_arrival: timestamp with time zone, may be NULL (example: 2017-09-05 11:15:00+00)
Describing the ‘status’ column
When I reached the column status, I realized there probably aren’t too many distinct values in the column, so I decided to find out all the values and put them in the prompt.
Again, I prompted ChatGPT, “there’s a column called “status”, tell me its distinct values?”

Add the unique values to the prompt, like this
status: character varying. Distinct values: "Departed", "Arrived", "On Time", "Cancelled", "Delayed", "Scheduled"
Step 3: Building the rest of the prompt
The next step involves building the head of the prompt by telling ChatGPT your database system (PostgreSQL 10) and your table’s name (flights).
I have a PostgreSQL 10 database with a table called "flights". I want you to help me find information from the database.
Here's the columns, data type, description and examples.
[insert question here if free tier, see below]
[paste column description from above]
If you are subscribed to a paid tier, consider creating your own GPT with the prompt. If you’re using the free tier, you can paste the whole prompt in with your question.

And the output:
| status | count |
|---|---|
| Arrived | 16707 |
| Scheduled | 15383 |
| On Time | 518 |
| Cancelled | 414 |
| Departed | 58 |
| Delayed | 41 |
Query: Most common planes departing from DME
Next: let’s use the ChatGPT query “which are the top 3 most common airplane_codes with departure_airport being DME”
I deliberately used actual column names in my prompt (airplane_codes, departure_airport) to avoid misinterpretation.
SELECT
aircraft_code,
COUNT(*) AS flight_count
FROM
flights
WHERE
departure_airport = 'DME'
GROUP BY
aircraft_code
ORDER BY
flight_count DESC
LIMIT 3;
Output:
| aircraft_code | flight_count |
|---|---|
| SU9 | 994 |
| CR2 | 829 |
| 321 | 427 |
You can find out what aircraft these are using this Wikipedia table.
Handling errors
Error? Copy it back into the same ChatGPT prompt and use its updated query.
If that doesn’t work, you’d have to examine whether you’ve described the column incorrectly.
Also, try feeding ChatGPT more examples. You can do this effectively by uploading the raw output from the 100-sample test done as the first step.
How this can help you
The prompt above has helped me get data much faster, rather than requesting it through a third party.
Once you’ve structured your prompt this way, you’ll start to see ChatGPT consistently give you correct, efficient SQL queries even if you don’t have deep SQL knowledge yourself.
You can iterate as much as you want, and have different analyses just by speaking plain English to ChatGPT. Plus the technique above is repeatable on any table.
More AI tutorials
Have a project in mind?
Websites. Graphics. SEO-oriented content.
I can get your next project off the ground.
See how I have helped my clients.