Biweekly
July 5, 2023

Biweekly Report (Jun.19 - Jul.2) – GreptimeDB v0.3.2 release

A recap of the past two weeks' progress and changes in GreptimeDB.

Summary

Together with all our contributors worldwide, we are glad to see GreptimeDB making remarkable progress for the better. Below are some highlights:

  • Support incremental database backup
  • Support copy from ORC format
  • Make the blob (binary) type work
  • Push all possible filters down to the execution plan of Parquet
  • Improve InfluxDB v2 API compatibility

Contributor list: (in alphabetical order)

Over the past two weeks, our community has been super active, with a total of 81 PRs merged. 7 PRs from 5 external contributors were merged successfully, and many more are pending. Congrats on becoming our most active contributors in the past 2 weeks:

👏 Let's welcome @sjcsjc123 and @sjmiller609 as the new contributors to join our community with their first PRs merged.

A big THANK YOU for the generous and brilliant contributions! It is people like you who are making GreptimeDB a great product. Let's build an even greater community together.

Highlights of Recent PR

Support incremental database backup

We now support the COPY DATABASE statement, which enables incremental database backups.

You can use the following SQL to back up the data of a whole database within a given time range to a target directory:

sql
COPY DATABASE greptime.public TO '<BACKUP_DIR>' WITH (FORMAT = 'parquet', start_time='2023-06-18 00:00:00', end_time='2023-06-19 00:00:00');

Please note that:

  • <BACKUP_DIR> must end with /. Each table is exported to a single file named after the table.
  • start_time and end_time must be valid timestamp strings, such as 2023-06-20 00:00:00 or 2023-06-20 00:00:00.000
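As a rough illustration of the notes above, the following Python sketch builds the COPY DATABASE statement for a one-day window and checks the directory and time range first. The helper names (`copy_database_sql`, `daily_window`) and the `/tmp/backup/` path are hypothetical, chosen only for this example; the statement text itself follows the SQL shown above.

```python
from datetime import datetime, timedelta

# Timestamp layout accepted by start_time / end_time, per the notes above.
TS_FORMAT = "%Y-%m-%d %H:%M:%S"


def copy_database_sql(database, backup_dir, start, end):
    """Build a COPY DATABASE statement for the given time range (hypothetical helper)."""
    if not backup_dir.endswith("/"):
        raise ValueError("backup directory must end with '/'")
    if start >= end:
        raise ValueError("start_time must be earlier than end_time")
    return (
        f"COPY DATABASE {database} TO '{backup_dir}' "
        f"WITH (FORMAT = 'parquet', "
        f"start_time='{start.strftime(TS_FORMAT)}', "
        f"end_time='{end.strftime(TS_FORMAT)}');"
    )


def daily_window(day):
    """Return (midnight of `day`, midnight of the next day)."""
    start = day.replace(hour=0, minute=0, second=0, microsecond=0)
    return start, start + timedelta(days=1)


start, end = daily_window(datetime(2023, 6, 18, 13, 45))
sql = copy_database_sql("greptime.public", "/tmp/backup/", start, end)
print(sql)
```

Running one such statement per day yields one backup slice per window. The sketch interpolates strings directly, so it is only suitable for trusted inputs.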

Some future work:

  • persist last backup time to manifest so that we don't have to specify the start time every time.
  • use procedure framework to allow resumption after interruption (shutdown or crash).

Support copy from ORC format

We now support copy from ORC format. Supported types include Boolean, String, Integers (i16, i32, i64), Floats (f32, f64), Timestamp (Nanosecond) and Date. Compression includes ZLIB and ZSTD.

Note: the Run Length Encoding defined in ORC spec v0 is not currently supported; writers should use Run Length Encoding V2 instead. See orc-rust.

We originally planned to support importing ORC data based on the orc-format crate, but we found many bugs while adding unit tests. In the end, we completely rewrote the rlev2 algorithm, incorporating the ORC format, and added support for decoding timestamp and date types. We also implemented an asynchronous stream reader.
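To give a feel for what an rlev2 decoder deals with, here is a minimal Python sketch of just one of the four RLE v2 sub-encodings, Short Repeat, written from the layout described in the ORC specification. It is not the orc-rust implementation, and the function names are made up for this example. The first test input is the sample given in the ORC specification for five repetitions of 10000.

```python
def zigzag_decode(n):
    """Map an unsigned zigzag-encoded value back to a signed integer."""
    return (n >> 1) ^ -(n & 1)


def decode_short_repeat(buf, signed=False):
    """Decode a single RLE v2 Short Repeat run starting at buf[0].

    Header byte layout: 2 bits sub-encoding (0 = Short Repeat),
    3 bits value width in bytes minus 1, 3 bits repeat count minus 3.
    The value follows in big-endian order.
    """
    header = buf[0]
    if header >> 6 != 0:
        raise ValueError("not a Short Repeat run")
    width = ((header >> 3) & 0x07) + 1   # 1 to 8 bytes
    count = (header & 0x07) + 3          # 3 to 10 repetitions
    if len(buf) < 1 + width:
        raise ValueError("truncated run")
    value = int.from_bytes(buf[1:1 + width], "big")
    if signed:
        value = zigzag_decode(value)
    return [value] * count


unsigned_run = decode_short_repeat(bytes([0x0A, 0x27, 0x10]))
signed_run = decode_short_repeat(bytes([0x00, 0x01]), signed=True)
print(unsigned_run, signed_run)
```

The other three sub-encodings (Direct, Patched Base, and Delta) involve bit-packed values and are considerably more involved, which is where most of the edge cases tend to appear.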

Support blob(binary) type

This pull request adds support for the blob type in the database. It has passed all the necessary tests and is compatible with the MySQL protocol and sqlness. It also includes fixes for issues encountered during implementation.

Push all possible filters down to the execution plan of Parquet

This PR pushes all possible filters down to the execution plan of Parquet to improve scan efficiency. It also coerces time range predicate data types to timestamp type in storage schemas to address #992.
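The benefit of pushing a time range filter down to the Parquet scan can be illustrated with a simplified Python sketch. It first coerces the predicate literals to a millisecond timestamp, then uses per-row-group min/max statistics to skip row groups that cannot match. This is a conceptual model only, with hypothetical names and an assumed millisecond timestamp column; it does not reflect GreptimeDB's actual code.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class RowGroupStats:
    """Min/max statistics of the timestamp column for one row group."""
    index: int
    min_ts: int  # milliseconds since epoch
    max_ts: int  # milliseconds since epoch


def coerce_to_millis(literal):
    """Coerce a predicate literal (int or 'YYYY-MM-DD HH:MM:SS') to epoch millis, assuming UTC."""
    if isinstance(literal, int):
        return literal
    dt = datetime.fromisoformat(literal).replace(tzinfo=timezone.utc)
    return int(dt.timestamp() * 1000)


def prune_row_groups(groups, start, end):
    """Return indexes of row groups that may contain rows with start <= ts < end."""
    start_ms, end_ms = coerce_to_millis(start), coerce_to_millis(end)
    return [g.index for g in groups if g.max_ts >= start_ms and g.min_ts < end_ms]


groups = [
    RowGroupStats(0, 0, 999),
    RowGroupStats(1, 1000, 1999),
    RowGroupStats(2, 2000, 2999),
]
selected = prune_row_groups(groups, "1970-01-01 00:00:01", "1970-01-01 00:00:02")
print(selected)
```

Only the selected row groups need to be read and decoded; rows inside them still have to be checked against the filter.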

Improve InfluxDB v2 API compatibility

This PR mainly improves InfluxDB v2 API compatibility.

Write using the curl command as shown below:

shell
curl -i -XPOST "http://localhost:4000/v1/influxdb/api/v2/write?bucket=public&precision=ms" \
--data-binary \
'monitor,host=127.0.0.1 cpu=0.1,memory=0.4 1667446797450
monitor,host=127.0.0.2 cpu=0.2,memory=0.3 1667446798450
monitor,host=127.0.0.1 cpu=0.5,memory=0.2 1667446798450'

You can also write using the Go SDK; see the example here.
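The same write can be issued from any HTTP client. The Python sketch below builds an equivalent request with the standard library only, using the endpoint, bucket, and precision from the curl command above. The helper names `to_line` and `build_write_request` are hypothetical, and the line formatting is deliberately simplified: it does no escaping and assumes numeric field values.

```python
import urllib.parse
import urllib.request


def to_line(measurement, tags, fields, ts_ms):
    """Format one point in InfluxDB line protocol (no escaping, numeric fields only)."""
    tag_part = ",".join(f"{k}={v}" for k, v in tags.items())
    field_part = ",".join(f"{k}={v}" for k, v in fields.items())
    return f"{measurement},{tag_part} {field_part} {ts_ms}"


def build_write_request(lines, base_url="http://localhost:4000",
                        bucket="public", precision="ms"):
    """Build a POST request for the v2 write endpoint; points are newline-separated."""
    query = urllib.parse.urlencode({"bucket": bucket, "precision": precision})
    url = f"{base_url}/v1/influxdb/api/v2/write?{query}"
    body = "\n".join(lines).encode("utf-8")
    return urllib.request.Request(url, data=body, method="POST")


lines = [
    to_line("monitor", {"host": "127.0.0.1"}, {"cpu": 0.1, "memory": 0.4}, 1667446797450),
    to_line("monitor", {"host": "127.0.0.2"}, {"cpu": 0.2, "memory": 0.3}, 1667446798450),
    to_line("monitor", {"host": "127.0.0.1"}, {"cpu": 0.5, "memory": 0.2}, 1667446798450),
]
request = build_write_request(lines)

# To send it to a running instance (not executed here):
# with urllib.request.urlopen(request) as resp:
#     print(resp.status)
```

Building the request separately from sending it keeps the payload easy to inspect before anything is written.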

These are the updates of GreptimeDB and we are constantly making progress. We believe that the strength of our software shines in the strengths of each individual community member. Thanks for all your contributions.

New Things

The Tech Preview version of GreptimeCloud has been officially released. This new version is a hosted Prometheus solution based on the powerful Serverless DBaaS architecture. This update brings wider compatibility, Prometheus rules managed with GitOps principles, and enhanced visualization tools. We sincerely invite you to explore GreptimeCloud and share your suggestions via Slack, by email at [email protected], or on GitHub discussions, where you can discuss anything related to Greptime.

We have released GreptimeDB v0.3.2. This version adds the TWCS (Time Window Compaction Strategy) compaction strategy inspired by Cassandra, supports compilation for CentOS 7 machines with only glibc 2.17, and adds row filter pushdown, among other changes. Stay tuned to our latest news and download GreptimeDB v0.3.2 to try it out!

GreptimeDB now passes over 66.61% of Prometheus's compliance tests, reflecting greatly improved PromQL compatibility! Track the latest progress in this issue.
