CSP: Digital Information
Lossless Compression
based on resources from code.org
PBL by Silver Oaks
Food for thought
- LOL
- TC
- GN8
Why might we use abbreviations when sending messages? What are the advantages?
Intro
Text Compression
I want to send this message to a friend, but their phone can only accept 80 characters of text at a time.
pitter_patter_pitter_patter_listen_to_the_rain_pitter_patter_pitter_patter_on_the_window_pane
I notice this pattern has some repetition in it, so rather than sending the whole message, I send this instead:
☄listen_to_the_rain_☄on_the_window_pane
Using abbreviations and symbols is a form of compression: we represent the same information with fewer characters. The original message has 93 characters, but the new message and its key (also called a dictionary) have a total of 56 characters. Our goal today is to create our own text compressions using similar methods.
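The substitution idea above can be sketched in a few lines of Python. This is only an illustration, not the tool used in class: the key is not spelled out in the text, so the sketch assumes the symbol stands for `pitter_patter_pitter_patter_`, and the function names (`compress`, `decompress`, `total_size`) are made up for this example. The message is written in lowercase throughout so that one symbol can stand for every occurrence of the phrase.

```python
# The example message, all lowercase so every repeat matches exactly.
message = (
    "pitter_patter_pitter_patter_listen_to_the_rain_"
    "pitter_patter_pitter_patter_on_the_window_pane"
)

# Assumed key (dictionary): symbol -> phrase it stands for.
dictionary = {"☄": "pitter_patter_pitter_patter_"}


def compress(text, dictionary):
    """Replace every dictionary phrase with its symbol."""
    for symbol, phrase in dictionary.items():
        text = text.replace(phrase, symbol)
    return text


def decompress(text, dictionary):
    """Replace every symbol with the phrase it stands for."""
    for symbol, phrase in dictionary.items():
        text = text.replace(symbol, phrase)
    return text


def total_size(compressed, dictionary):
    """Characters in the compressed message plus the characters in the key."""
    key_size = sum(len(symbol) + len(phrase) for symbol, phrase in dictionary.items())
    return len(compressed) + key_size


compressed = compress(message, dictionary)
print(compressed)
print(len(message), total_size(compressed, dictionary))

# Nothing is lost: decompressing gives back the original message.
assert decompress(compressed, dictionary) == message
```

The total depends on the key you choose. With the key assumed here, the message is 39 characters and the key is 29, for a total of 68. If the symbol stands for the shorter phrase `pitter_patter_` instead, it appears twice in a row in the message, and the total is 41 + 15 = 56 characters.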
Activity
Text Compression
The compression percentage at the bottom of the screen is calculated by comparing the number of bytes in the original message and the number of bytes in the compressed message.
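As a rough sketch of how such a percentage could be worked out, the Python below compares byte counts. The exact formula the activity uses is not given here, so this assumes the percentage means "bytes saved, as a share of the original size" and that text is stored as UTF-8. The function name `compression_percentage` is made up for this example.

```python
def compression_percentage(original: str, compressed: str) -> float:
    """Percentage of bytes saved, assuming both texts are stored as UTF-8."""
    original_bytes = len(original.encode("utf-8"))
    compressed_bytes = len(compressed.encode("utf-8"))
    return 100 * (original_bytes - compressed_bytes) / original_bytes


# Hypothetical example: a 100-byte message squeezed down to 60 bytes.
print(compression_percentage("a" * 100, "a" * 60))  # 40.0
```

A higher percentage means more bytes were saved. A negative result would mean the "compressed" version is actually larger than the original.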
- Choose from the available sample texts
- Try compressing the text
- Take a screenshot once you are done
- Paste the screenshot into a Google Slides or Google Docs file and explain:
“What strategies are you using to compress your sample text? Which ones seem most successful?”
- Submit your Google Slides or Docs file as part of the Google Classroom assignment.
You will eventually reach the ‘limit’ for how much you can compress a particular message, and not every message can be compressed by a high percentage.
Think: what makes some messages more compressible than others?
Wrapup
Reflection
- ‘Easier’ texts usually have lots of repetition – repeated words, phrases, or syllables. A useful strategy is to use this repetition to create the compression.
- ‘Difficult’ texts usually have less repetition, so this method of compression is harder to apply. Some strategies may actually make compression worse, which can be counter-intuitive.
Lossless Compression: A process for reducing the number of bits needed to represent something without losing any information. This process is reversible.
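To see why repetition matters, and how a strategy can make things worse, here is a small Python sketch. It assumes every key entry costs one symbol plus the full phrase, and that each use of the phrase in the message is replaced by a one-character symbol. The function name `net_savings` and the `*` symbol are made up for this example.

```python
def net_savings(message: str, phrase: str, symbol: str = "*") -> int:
    """Characters saved by one key entry, after paying for the entry itself.

    Positive means the entry helps; negative means the total gets larger.
    """
    uses = message.count(phrase)  # non-overlapping occurrences
    saved_in_message = uses * (len(phrase) - len(symbol))
    cost_of_key_entry = len(symbol) + len(phrase)
    return saved_in_message - cost_of_key_entry


message = (
    "pitter_patter_pitter_patter_listen_to_the_rain_"
    "pitter_patter_pitter_patter_on_the_window_pane"
)

for phrase in ("pitter_patter_", "_the_", "window"):
    print(phrase, net_savings(message, phrase))
```

The phrase `pitter_patter_` appears four times and saves 37 characters. `_the_` appears twice and saves only 2. `window` appears once, so adding it to the key costs more than it saves (a result of -2): the total gets longer, not shorter.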