A modular architecture for Unicode text compression

Adam Gleave (University of Cambridge)
Tuesday 14 June 2016, 15:00-15:15
Cambridge University Engineering Department, CBL Seminar room BE4-38.

If you have a question about this talk, please contact Adam Gleave.

Unicode is now ubiquitous, with 87% of online content in the UTF-8 character encoding. Conventional compression techniques operate on individual bytes: this works well for ASCII, but poorly for UTF-8, where a character can span multiple bytes. Previous attempts at Unicode compression have invented new algorithms from scratch, with generally poor results. My approach is to extend existing data compression algorithms to operate over Unicode characters. I find this substantially improves compression effectiveness for Unicode text, with only a small overhead for ASCII and binary files.
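As a small illustration of why byte-oriented compressors see UTF-8 text differently from ASCII, the sketch below counts Unicode code points against UTF-8 bytes for a few strings. It is not taken from the talk: the helper name `utf8_symbol_counts` and the sample strings are assumptions chosen for the example, and it only shows the mismatch in symbol granularity, not any compression algorithm.

```python
from typing import Tuple


def utf8_symbol_counts(text: str) -> Tuple[int, int]:
    """Return (number of Unicode code points, number of UTF-8 bytes).

    Hypothetical helper for illustration only.
    """
    return len(text), len(text.encode("utf-8"))


# Sample strings (assumed for this example), written with escapes so the
# code point counts do not depend on how the file is normalised.
samples = {
    "ascii": "compression",
    "greek": "\u03c3\u03c5\u03bc\u03c0\u03af\u03b5\u03c3\u03b7",
    "chinese": "\u538b\u7f29",
}

for name, text in samples.items():
    chars, nbytes = utf8_symbol_counts(text)
    print(f"{name}: {chars} code points, {nbytes} UTF-8 bytes")
```

For ASCII the two counts coincide, so a model over bytes is also a model over characters. For the Greek and Chinese samples each character occupies two or three bytes, so a byte-level model has to predict several dependent symbols per character.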

Please note the talk will last for 15 minutes, although I will be available afterwards for any further questions.

This talk is part of the "arg58's list" series.
