md/r5cache: State machine for raid5-cache write back mode
This patch adds a state machine for raid5-cache. With a log device, a
raid456 array can operate in one of two modes (r5c_journal_mode):
- write-back (R5C_MODE_WRITE_BACK)
- write-through (R5C_MODE_WRITE_THROUGH)
The existing raid5-cache code only supports write-through mode.
Supporting a write-back cache requires extending the state machine.
With a write-back cache, every stripe operates in one of two phases:
- caching
- writing-out
In the caching phase, the stripe handles writes as:
- write to journal
- return IO
In the writing-out phase, the stripe behaves like a stripe in
write-through mode (R5C_MODE_WRITE_THROUGH).
STRIPE_R5C_CACHING is added to sh->state to differentiate the caching
and writing-out phases.
Please note: this is a "no-op" patch for raid5-cache write-through
mode.
The following detailed explanation is copied from raid5-cache.c:
/*
* raid5 cache state machine
*
* With the RAID cache, each stripe works in two phases:
* - caching phase
* - writing-out phase
*
* These two phases are controlled by bit STRIPE_R5C_CACHING:
* if STRIPE_R5C_CACHING == 0, the stripe is in writing-out phase
* if STRIPE_R5C_CACHING == 1, the stripe is in caching phase
*
* When there is no journal, or the journal is in write-through mode,
* the stripe is always in writing-out phase.
*
* For write-back journal, the stripe is sent to caching phase on write
* (r5c_handle_stripe_dirtying). r5c_make_stripe_write_out() kicks off
* the write-out phase by clearing STRIPE_R5C_CACHING.
*
* Stripes in caching phase do not write the raid disks. Instead, all
* writes are committed from the log device. Therefore, a stripe in
* caching phase handles writes as:
* - write to log device
* - return IO
*
* Stripes in writing-out phase handle writes as:
* - calculate parity
* - write pending data and parity to journal
* - write data and parity to raid disks
* - return IO for pending writes
*/
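The two phases described above can be illustrated with a small user-space
toy model. This is only a sketch and not the kernel code: every name in it
(toy_mode, toy_stripe, toy_write, toy_write_out) is hypothetical, the
counters stand in for real IO, and parity calculation plus the disk write
are collapsed into a single step.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model; none of these names exist in drivers/md. */
enum toy_mode { TOY_WRITE_THROUGH, TOY_WRITE_BACK };

struct toy_stripe {
	bool caching;       /* models the STRIPE_R5C_CACHING bit */
	int dirty;          /* writes not yet on the raid disks */
	int journal_writes; /* writes issued to the log device */
	int raid_writes;    /* write-outs issued to the raid disks */
	int ios_returned;   /* writes completed back to the caller */
};

/*
 * Writing-out phase: clear the caching bit, then push pending data
 * (and, in the real code, parity) to the raid disks.
 */
static void toy_write_out(struct toy_stripe *sh)
{
	assert(sh->dirty >= 0);
	sh->caching = false;
	if (sh->dirty > 0) {
		sh->raid_writes++; /* parity calculation + disk write, collapsed */
		sh->dirty = 0;
	}
}

/* Handle one incoming write according to the journal mode. */
static void toy_write(struct toy_stripe *sh, enum toy_mode mode)
{
	sh->journal_writes++; /* both phases write to the journal first */
	sh->dirty++;

	if (mode == TOY_WRITE_BACK) {
		/* Caching phase: the IO returns once the journal has the data. */
		sh->caching = true;
		sh->ios_returned++;
		return;
	}

	/* Write-through: the stripe is always in the writing-out phase. */
	toy_write_out(sh);
	sh->ios_returned++;
}
```

In this model a write-back write returns after touching only the journal,
and the raid disks are written later when toy_write_out() clears the
caching bit. A write-through write reaches the raid disks before its IO
is returned.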
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
diff --git a/drivers/md/raid5.h b/drivers/md/raid5.h
index ffc13c4..c9590a8 100644
--- a/drivers/md/raid5.h
+++ b/drivers/md/raid5.h
@@ -264,6 +264,7 @@
int syncing, expanding, expanded, replacing;
int locked, uptodate, to_read, to_write, failed, written;
int to_fill, compute, req_compute, non_overwrite;
+ int injournal;
int failed_num[2];
int p_failed, q_failed;
int dec_preread_active;
@@ -313,6 +314,11 @@
*/
R5_Discard, /* Discard the stripe */
R5_SkipCopy, /* Don't copy data from bio to stripe cache */
+ R5_InJournal, /* data being written is in the journal device.
+ * if R5_InJournal is set for parity pd_idx, all the
+ * data and parity being written are in the journal
+ * device
+ */
};
/*
@@ -345,7 +351,23 @@
STRIPE_BITMAP_PENDING, /* Being added to bitmap, don't add
* to batch yet.
*/
- STRIPE_LOG_TRAPPED, /* trapped into log */
+ STRIPE_LOG_TRAPPED, /* trapped into log (see raid5-cache.c)
+ * this bit is used in two scenarios:
+ *
+ * 1. write-out phase
+ * set in first entry of r5l_write_stripe
+ * clear in second entry of r5l_write_stripe
+ * used to bypass logic in handle_stripe
+ *
+ * 2. caching phase
+ * set in r5c_try_caching_write()
+ * clear when journal write is done
+ * used to initiate r5c_cache_data()
+ * also used to bypass logic in handle_stripe
+ */
+ STRIPE_R5C_CACHING, /* the stripe is in caching phase
+ * see more detail in the raid5-cache.c
+ */
};
#define STRIPE_EXPAND_SYNC_FLAGS \
@@ -710,4 +732,11 @@
extern int r5l_handle_flush_request(struct r5l_log *log, struct bio *bio);
extern void r5l_quiesce(struct r5l_log *log, int state);
extern bool r5l_log_disk_error(struct r5conf *conf);
+extern bool r5c_is_writeback(struct r5l_log *log);
+extern int
+r5c_try_caching_write(struct r5conf *conf, struct stripe_head *sh,
+ struct stripe_head_state *s, int disks);
+extern void
+r5c_finish_stripe_write_out(struct r5conf *conf, struct stripe_head *sh,
+ struct stripe_head_state *s);
#endif