Taiwan Historical Weather Datasets
This dataset contains historical meteorological observations covering 128 years (1896 to 2023). Each row is a measurement taken at a specific date and time by a specific weather station.
The origin of this dataset is available here and the list of weather station numbers can be found here.
The data come from two sources: the meteorological stations established by the Central Weather Administration (station codes beginning with C0, C1, or 4) and the agricultural meteorological stations belonging to the Council of Agriculture (all other station codes). Each record contains the following fields:
- StationId
- MeasuredDate, the observation time
- StnPres, the station air pressure
- SeaPres, the sea level pressure
- Td, the dew point temperature
- RH, the relative humidity
- Other elements where available
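The station-code rule described above can be sketched in a few lines of Python. This is a minimal illustration, not part of the dataset tooling: the function name `station_source` and the sample station codes are assumptions made for the example, and the check assumes codes are stored in upper case.

```python
def station_source(station_id: str) -> str:
    """Return the operating agency implied by a station code prefix.

    Rule taken from the text: codes starting with C0, C1, or 4 belong to
    the Central Weather Administration; all other codes belong to the
    Council of Agriculture. The function name is hypothetical.
    """
    cwa_prefixes = ("C0", "C1", "4")
    # str.startswith accepts a tuple and matches any of its entries.
    if station_id.startswith(cwa_prefixes):
        return "Central Weather Administration"
    return "Council of Agriculture"


# Illustrative station codes only (assumed for this example).
for code in ("C0A520", "C1A630", "466920", "72C440"):
    print(code, "->", station_source(code))
```

A helper like this can be handy when adding an enrichment column of your own while loading the data.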
Downloading the data
- A pre-processed version of the data for ClickHouse, which has been cleaned, re-structured, and enriched. This dataset covers the years 1896 to 2023.
- Download the original raw data and convert it to the format required by ClickHouse. Users who want to add their own columns may wish to explore this approach and extend it as needed.
Pre-processed data
The dataset has also been re-structured from one measurement per line to one row per weather station ID and measured date.
This makes the data easier to query and leaves the resulting table less sparse. Some elements are null because they cannot be measured at that weather station.
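The re-structuring described above can be sketched as a small pivot in Python. This is only an illustration under stated assumptions: the layout of the input tuples `(station_id, measured_date, element, value)`, the function name `pivot_measurements`, and the sample values are all made up for the example and do not reflect the actual raw file format.

```python
def pivot_measurements(long_rows, columns=("StnPres", "SeaPres", "Td", "RH")):
    """Pivot one-measurement-per-line tuples into one row per
    (station_id, measured_date). Elements a station did not report stay None.
    """
    wide = {}
    for station_id, measured_date, element, value in long_rows:
        key = (station_id, measured_date)
        if key not in wide:
            # Start every row with all element columns set to None (null).
            row = {"StationId": station_id, "MeasuredDate": measured_date}
            row.update({name: None for name in columns})
            wide[key] = row
        if element in columns:
            wide[key][element] = value
    return list(wide.values())


# Made-up sample input: one measurement per line (assumed layout).
long_rows = [
    ("466920", "2023-01-01 00:00:00", "StnPres", 1018.5),
    ("466920", "2023-01-01 00:00:00", "Td", 12.3),
    ("C0A520", "2023-01-01 00:00:00", "RH", 85),
]

for row in pivot_measurements(long_rows):
    print(row)
```

The three input lines collapse into two rows, and the elements that were never reported for a station remain null.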
This dataset is available in the following Google Cloud Storage location. Either download the files to your local filesystem (and insert them with the ClickHouse client) or insert them directly into ClickHouse (see Inserting from URL).
To download:
Original raw data
The following steps describe how to download the original raw data so that you can transform and convert it as you want.
Download
To download the original raw data:
Retrieve the Taiwan weather stations
Create table schema
Create the MergeTree table in ClickHouse (from the ClickHouse client).
Inserting into ClickHouse
Inserting from local file
Data can be inserted from a local file as follows (from the ClickHouse client):
where /path/to is the path to the local file on disk.
After the data has been inserted into ClickHouse, the sample response output is as follows:
Inserting from URL
To learn how to speed this up, see our blog post on tuning large data loads.
Check data rows and sizes
- Let's see how many rows were inserted:
- Let's see how much disk space is used by this table:
Sample queries
Q1: Retrieve the highest dew point temperature for each weather station in a specific year
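To make the intent of Q1 concrete, the following Python sketch computes the same kind of aggregate over a few in-memory rows. It is not the ClickHouse query itself: the function name `max_dew_point_by_station` and the sample rows are assumptions made for illustration, and only the column names StationId, MeasuredDate, and Td come from the field list above.

```python
from datetime import datetime


def max_dew_point_by_station(rows, year):
    """Return {StationId: highest Td} for measurements taken in `year`.

    Null (None) dew points are skipped, which mirrors how SQL aggregate
    functions ignore NULL values.
    """
    result = {}
    for row in rows:
        td = row["Td"]
        if td is None or row["MeasuredDate"].year != year:
            continue
        station = row["StationId"]
        if station not in result or td > result[station]:
            result[station] = td
    return result


# Made-up sample rows (values are illustrative, not real observations).
rows = [
    {"StationId": "466920", "MeasuredDate": datetime(2023, 7, 1, 12), "Td": 24.1},
    {"StationId": "466920", "MeasuredDate": datetime(2023, 8, 1, 12), "Td": 25.6},
    {"StationId": "466920", "MeasuredDate": datetime(2022, 8, 1, 12), "Td": 27.0},
    {"StationId": "C0A520", "MeasuredDate": datetime(2023, 7, 1, 12), "Td": None},
    {"StationId": "C0A520", "MeasuredDate": datetime(2023, 7, 2, 12), "Td": 22.4},
]

print(max_dew_point_by_station(rows, 2023))
```

The 2022 row and the null reading are both excluded, so each station reports only its highest valid dew point for the requested year.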
Q2: Fetch raw data for a specific time range, set of fields, and weather station
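The shape of Q2 (a filter on station and time range plus a projection onto selected fields) can be sketched in Python as follows. This is a hedged illustration rather than the ClickHouse query: the function name `fetch_raw`, the half-open time range `[start, end)`, and the sample rows are all assumptions made for the example.

```python
from datetime import datetime


def fetch_raw(rows, station_id, start, end, fields):
    """Return the selected fields for one station within [start, end).

    The half-open interval is an assumption of this sketch; adjust the
    comparison if both bounds should be inclusive.
    """
    return [
        {name: row.get(name) for name in fields}
        for row in rows
        if row["StationId"] == station_id and start <= row["MeasuredDate"] < end
    ]


# Made-up sample rows (values are illustrative, not real observations).
rows = [
    {"StationId": "C0A520", "MeasuredDate": datetime(2023, 1, 1, 0), "StnPres": 1015.2, "RH": 88},
    {"StationId": "C0A520", "MeasuredDate": datetime(2023, 1, 1, 1), "StnPres": 1015.0, "RH": 90},
    {"StationId": "C0A520", "MeasuredDate": datetime(2023, 1, 2, 0), "StnPres": 1016.1, "RH": 75},
    {"StationId": "466920", "MeasuredDate": datetime(2023, 1, 1, 0), "StnPres": 1018.5, "RH": 80},
]

selected = fetch_raw(
    rows,
    station_id="C0A520",
    start=datetime(2023, 1, 1),
    end=datetime(2023, 1, 2),
    fields=["MeasuredDate", "StnPres"],
)
for row in selected:
    print(row)
```

Only the two C0A520 readings from 1 January fall inside the range, and each result row carries just the requested fields.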
Credits
We would like to acknowledge the efforts of the Central Weather Administration and Agricultural Meteorological Observation Network (Station) of the Council of Agriculture for preparing, cleaning, and distributing this dataset. We appreciate your efforts.
Ou, J.-H., Kuo, C.-H., Wu, Y.-F., Lin, G.-C., Lee, M.-H., Chen, R.-K., Chou, H.-P., Wu, H.-Y., Chu, S.-C., Lai, Q.-J., Tsai, Y.-C., Lin, C.-C., Kuo, C.-C., Liao, C.-T., Chen, Y.-N., Chu, Y.-W., Chen, C.-Y., 2023. Application-oriented deep learning model for early warning of rice blast in Taiwan. Ecological Informatics 73, 101950. https://doi.org/10.1016/j.ecoinf.2022.101950 [13/12/2022]
