Scylla Compaction Strategies
Size-Tiered Compaction Strategy [STCS]
- STCS organises SSTables into tiers
- The tiers are based on SSTable size, on an exponential scale
- Compacting several SSTables produces a single SSTable
- The result may be as large as the union of its inputs, in which case it moves to the next tier
- or it may be much smaller because of deletes and expirations, potentially dropping to a lower tier
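The tiering described above can be sketched in a few lines of Python. This is an illustration only, not Scylla's implementation: the names `tier_of`, `group_into_tiers` and `compact`, the 1 MB base size and the growth factor of 4 are all assumptions chosen to make the exponential scale visible.

```python
from collections import defaultdict

BASE_SIZE_MB = 1  # assumed upper bound unit for the smallest tier
GROWTH = 4        # assumed growth factor between consecutive tiers


def tier_of(size_mb):
    """Map an SSTable size to a tier index on an exponential scale.

    Tier 0 holds sizes below 4 MB, tier 1 sizes below 16 MB, and so on.
    """
    tier = 0
    upper = BASE_SIZE_MB * GROWTH
    while size_mb >= upper:
        tier += 1
        upper *= GROWTH
    return tier


def group_into_tiers(sizes_mb):
    """Group SSTable sizes by the tier they fall into."""
    tiers = defaultdict(list)
    for size in sizes_mb:
        tiers[tier_of(size)].append(size)
    return dict(tiers)


def compact(sizes_mb, surviving_fraction=1.0):
    """Size of the single SSTable produced by compacting the inputs.

    surviving_fraction models data removed by deletes and expirations
    (1.0 means nothing was dropped).
    """
    return sum(sizes_mb) * surviving_fraction
```

With these assumed numbers, four 1 MB SSTables from tier 0 compact into one 4 MB SSTable that lands in tier 1, while four 5 MB SSTables from tier 1 with only 10% of the data surviving produce a 2 MB SSTable that drops back to tier 0.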
STCS Space Amplification
- STCS requires disk space of at least twice the data size; this overhead is called space amplification. It has two causes:
- Temporary space needed while a compaction is running
- Accumulation of updates and deletes across different tiers
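A rough worked example of both causes, using made-up sizes. The helper names are hypothetical, and the first function assumes that input SSTables stay on disk until the compaction output is completely written.

```python
def peak_disk_usage(input_sizes_gb, output_size_gb):
    """Disk space held while one compaction runs.

    Assumption: inputs can only be deleted once the output is complete,
    so inputs and output coexist on disk at the peak.
    """
    return sum(input_sizes_gb) + output_size_gb


def space_amplification(disk_used_gb, live_data_gb):
    """Ratio of disk space used to the size of the live data."""
    return disk_used_gb / live_data_gb


# Temporary space: four 25 GB SSTables with no overlapping data
# compact into one 100 GB SSTable.
inputs = [25, 25, 25, 25]
peak = peak_disk_usage(inputs, output_size_gb=100)
temporary_amplification = space_amplification(peak, live_data_gb=100)

# Accumulation: the same 25 GB of rows rewritten four times, with each
# version sitting in a different SSTable until they are compacted together.
accumulated_amplification = space_amplification(sum(inputs), live_data_gb=25)
```

Under these assumptions the temporary-space case peaks at 200 GB for 100 GB of data, which is the factor of two mentioned above, and the accumulation case holds 100 GB on disk for 25 GB of live data.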
Leveled Compaction Strategy [LCS]
- Compaction is triggered when a level holds more SSTables than its limit (10 in L1, growing tenfold with each level)
- LCS picks one SSTable of size x from level i to compact
- It then finds the roughly 10 SSTables in the next level
- that overlap with this SSTable, and compacts all of them together
- It writes the resulting run to the next level; the run size is bounded by (1+10)*x
- LCS limits space amplification, but at the cost of higher write amplification
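The selection step above can be sketched as follows. This is a simplified model rather than Scylla's code: the `SSTable` class, integer keys, and the helper names are assumptions made for illustration.

```python
from dataclasses import dataclass

FANOUT = 10  # each level is assumed to be about 10x larger than the previous one


@dataclass(frozen=True)
class SSTable:
    first_key: int  # smallest key stored in the SSTable
    last_key: int   # largest key stored in the SSTable
    size: int       # size in MB


def overlaps(a, b):
    """True if the key ranges of two SSTables intersect."""
    return a.first_key <= b.last_key and b.first_key <= a.last_key


def pick_compaction(candidate, next_level):
    """Return the candidate plus every next-level SSTable that overlaps it."""
    return [candidate] + [s for s in next_level if overlaps(candidate, s)]


def max_output_size(x):
    """Upper bound on the size of the run written to the next level."""
    return (1 + FANOUT) * x


# One SSTable from level i covering keys 0..99, and a next level made of
# 20 SSTables that each cover 10 keys.
candidate = SSTable(first_key=0, last_key=99, size=160)
next_level = [SSTable(k, k + 9, 160) for k in range(0, 200, 10)]
inputs = pick_compaction(candidate, next_level)
input_size = sum(s.size for s in inputs)
```

In this example the candidate overlaps 10 of the 20 next-level SSTables, so 11 SSTables are rewritten in order to move one SSTable's worth of data down a level. That ratio is where the higher write amplification comes from.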
Time Window Compaction Strategy [TWCS]
- Memtables have a write time, and SSTables inherit it
- Only SSTables that belong to the same window are compacted together
CREATE TABLE twcs.example (
    id int,
    value int,
    text_value text,
    PRIMARY KEY (id, value)
) WITH CLUSTERING ORDER BY (value ASC)
AND compaction = {
    'class' : 'TimeWindowCompactionStrategy',
    'compaction_window_unit' : 'DAYS',
    'compaction_window_size' : '1'
};
- TWCS works best when all data in a partition is inserted in the same window, or in a small number of windows
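- Deletes and writes:
- The data was written 1 year ago, so the data point is in SSTable 1
- The delete is written now, so the tombstone is in SSTable N
- Reads have to read both of them
- The data is never really gone, because SSTables 1 and N are in different windows and are never compacted together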
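The window rule and the delete scenario above can be sketched as follows. This is a minimal model that assumes one write time per SSTable and one-day windows counted from the Unix epoch. The function names and SSTable names are hypothetical, and the way Scylla actually assigns SSTables to windows may differ in detail.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

WINDOW = timedelta(days=1)  # mirrors compaction_window_size 1, unit DAYS
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def window_of(write_time):
    """Index of the time window that a write time falls into."""
    return (write_time - EPOCH) // WINDOW


def group_by_window(sstables):
    """Bucket (name, write_time) pairs by window.

    Only SSTables that share a bucket are candidates for compaction together.
    """
    buckets = defaultdict(list)
    for name, write_time in sstables:
        buckets[window_of(write_time)].append(name)
    return dict(buckets)


now = datetime(2024, 6, 1, 12, 0, tzinfo=timezone.utc)
sstables = [
    ("sstable_1", now - timedelta(days=365)),  # the original data point
    ("sstable_n", now),                        # the tombstone deleting it
    ("sstable_m", now - timedelta(hours=2)),   # another flush from today
]
buckets = group_by_window(sstables)
```

Here `sstable_n` and `sstable_m` share a window and can be compacted together, while `sstable_1` sits alone in a window from a year earlier. The tombstone and the data it deletes never meet in a compaction, which is why deletes are a poor fit for this strategy.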
Incremental Compaction Strategy [ICS]
- Observed problems with the legacy compaction strategies:
- STCS has high space amplification and LCS has high write amplification
- ICS works on SSTable runs: a run is a sorted set of SSTables
- The SSTables in a run are non-overlapping
- They are called fragments
- A run is equivalent to one large SSTable split into several smaller SSTables
- Fragments are disjoint and sorted with respect to each other, so the runs can be scanned fragment by fragment and compacted incrementally
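A minimal sketch of fragment-by-fragment compaction, under simplifying assumptions: a run is modelled as a list of fragments, each fragment is a sorted list of integer keys with no values or timestamps, and duplicate keys are simply collapsed. The names `scan`, `compact_runs` and `FRAGMENT_SIZE` are hypothetical.

```python
import heapq

FRAGMENT_SIZE = 3  # assumed maximum number of keys per output fragment


def scan(run):
    """Yield the keys of a run in order, reading one fragment at a time.

    Because fragments are disjoint and sorted relative to each other,
    reading them in sequence yields a fully sorted stream of keys.
    """
    for fragment in run:
        yield from fragment


def compact_runs(runs):
    """Merge several runs into one new run made of bounded-size fragments."""
    output, current, last = [], [], None
    # heapq.merge pulls lazily from each scan, so only the fragments that
    # are currently being read need to be held by the merge.
    for key in heapq.merge(*(scan(run) for run in runs)):
        if key == last:
            continue  # the same key appeared in more than one run
        last = key
        current.append(key)
        if len(current) == FRAGMENT_SIZE:
            output.append(current)  # seal this output fragment
            current = []
    if current:
        output.append(current)
    return output


run_a = [[1, 2, 3], [4, 5, 6]]
run_b = [[2, 7], [8, 9]]
new_run = compact_runs([run_a, run_b])
```

The output is again a run: its fragments are sorted and do not overlap, so it can take part in a later compaction in the same way.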