Tile Generation Tutorial (Waystation edition)
In [1]:
env DATASET_URL=http://waystation:6077/
env: DATASET_URL=http://waystation:6077/
We can run the same save_tiles command as before, since we already ran the preceding steps (if you haven't worked through 2_tiling-file, run it first!).
In [2]:
import pandas as pd
In [3]:
!save_tiles \
~/vmount/PRO-12-123/data/toy_data_set/01OV002-bd8cdc70-3d46-40ae-99c4-90ef77.svs \
~/vmount/PRO-12-123/tiling/test/label \
--num_cores 4 --batch_size 200 --dataset_id PRO_TILES_LABELED_S3 \
-o ~/vmount/PRO-12-123/tiling/test/saved_tiles
2022-07-14 08:22:34,212 - INFO - root - Initalized logger, log file at: luna.log
2022-07-14 08:22:36,024 - INFO - luna.common.utils - Started CLI Runner wtih <function save_tiles at 0x7ff4b9c26ca0>
2022-07-14 08:22:36,026 - INFO - luna.common.utils - Validating params...
2022-07-14 08:22:36,028 - INFO - luna.common.utils - -> Set input_slide_image (<class 'str'>) = /home/pashaa/vmount/PRO-12-123/data/toy_data_set/01OV002-bd8cdc70-3d46-40ae-99c4-90ef77.svs
2022-07-14 08:22:36,029 - INFO - luna.common.utils - -> Set input_slide_tiles (<class 'str'>) = /home/pashaa/vmount/PRO-12-123/tiling/test/label
2022-07-14 08:22:36,030 - INFO - luna.common.utils - -> Set output_dir (<class 'str'>) = /home/pashaa/vmount/PRO-12-123/tiling/test/saved_tiles
2022-07-14 08:22:36,032 - INFO - luna.common.utils - -> Set num_cores (<class 'int'>) = 4
2022-07-14 08:22:36,033 - INFO - luna.common.utils - -> Set batch_size (<class 'int'>) = 200
2022-07-14 08:22:36,036 - INFO - luna.common.utils - Expanding inputs...
2022-07-14 08:22:36,037 - INFO - luna.common.utils - Attempting to read metadata at /home/pashaa/vmount/PRO-12-123/data/toy_data_set/01OV002-bd8cdc70-3d46-40ae-99c4-90ef77.svs/metadata.yml
2022-07-14 08:22:36,038 - INFO - luna.common.utils - Attempting to read metadata at /home/pashaa/vmount/PRO-12-123/tiling/test/label/metadata.yml
2022-07-14 08:22:36,047 - INFO - luna.common.utils - Expanded input:
2022-07-14 08:22:36,047 - INFO - luna.common.utils - -> /home/pashaa/vmount/PRO-12-123/tiling/test/label
2022-07-14 08:22:36,047 - INFO - luna.common.utils - -> /home/pashaa/vmount/PRO-12-123/tiling/test/label/01OV002-bd8cdc70-3d46-40ae-99c4-90ef77.regional_label.tiles.parquet
2022-07-14 08:22:36,049 - INFO - luna.common.utils - Found segment keys: {'dsa_collection_uuid': '62ce95d4ff5873883c9dae25', 'slide_id': '01OV002-bd8cdc70-3d46-40ae-99c4-90ef77'}
2022-07-14 08:22:36,051 - INFO - luna.common.utils - Resolved input:
2022-07-14 08:22:36,051 - INFO - luna.common.utils - -> /home/pashaa/vmount/PRO-12-123/data/toy_data_set/01OV002-bd8cdc70-3d46-40ae-99c4-90ef77.svs
2022-07-14 08:22:36,051 - INFO - luna.common.utils - -> /home/pashaa/vmount/PRO-12-123/data/toy_data_set/01OV002-bd8cdc70-3d46-40ae-99c4-90ef77.svs
2022-07-14 08:22:36,052 - INFO - luna.common.utils - Resolved input:
2022-07-14 08:22:36,052 - INFO - luna.common.utils - -> /home/pashaa/vmount/PRO-12-123/tiling/test/label/01OV002-bd8cdc70-3d46-40ae-99c4-90ef77.regional_label.tiles.parquet
2022-07-14 08:22:36,052 - INFO - luna.common.utils - -> /home/pashaa/vmount/PRO-12-123/tiling/test/label/01OV002-bd8cdc70-3d46-40ae-99c4-90ef77.regional_label.tiles.parquet
2022-07-14 08:22:36,055 - INFO - luna.common.utils - Full segment key set: {'dsa_collection_uuid': '62ce95d4ff5873883c9dae25', 'slide_id': '01OV002-bd8cdc70-3d46-40ae-99c4-90ef77'}
2022-07-14 08:22:36,056 - INFO - luna.common.utils - ------------------------------------------------------------
2022-07-14 08:22:36,056 - INFO - luna.common.utils - Starting transform::save_tiles
2022-07-14 08:22:36,056 - INFO - luna.common.utils - ------------------------------------------------------------
2022-07-14 08:22:36,093 - INFO - generate_tiles - Now generating tiles with num_cores=4 and batch_size=200!
2022-07-14 08:22:38,883 - WARNING - generate_tiles - /home/pashaa/vmount/PRO-12-123/tiling/test/saved_tiles/01OV002-bd8cdc70-3d46-40ae-99c4-90ef77.tiles.h5 already exists, deleting the file..
100%|███████████████████████████████████████████| 26/26 [00:43<00:00, 1.68s/it]
2022-07-14 08:23:24,029 - INFO - generate_tiles - x_coord ... tile_store
2022-07-14 08:23:24,029 - INFO - generate_tiles - address ...
2022-07-14 08:23:24,029 - INFO - generate_tiles - x3_y81_z10.0 1536 ... /home/pashaa/vmount/PRO-12-123/tiling/test/sav...
2022-07-14 08:23:24,029 - INFO - generate_tiles - x3_y84_z10.0 1536 ... /home/pashaa/vmount/PRO-12-123/tiling/test/sav...
2022-07-14 08:23:24,029 - INFO - generate_tiles - x3_y85_z10.0 1536 ... /home/pashaa/vmount/PRO-12-123/tiling/test/sav...
2022-07-14 08:23:24,029 - INFO - generate_tiles - x3_y86_z10.0 1536 ... /home/pashaa/vmount/PRO-12-123/tiling/test/sav...
2022-07-14 08:23:24,029 - INFO - generate_tiles - x3_y87_z10.0 1536 ... /home/pashaa/vmount/PRO-12-123/tiling/test/sav...
2022-07-14 08:23:24,029 - INFO - generate_tiles - ... ... ... ...
2022-07-14 08:23:24,029 - INFO - generate_tiles - x93_y65_z10.0 47616 ... /home/pashaa/vmount/PRO-12-123/tiling/test/sav...
2022-07-14 08:23:24,029 - INFO - generate_tiles - x93_y66_z10.0 47616 ... /home/pashaa/vmount/PRO-12-123/tiling/test/sav...
2022-07-14 08:23:24,029 - INFO - generate_tiles - x93_y67_z10.0 47616 ... /home/pashaa/vmount/PRO-12-123/tiling/test/sav...
2022-07-14 08:23:24,029 - INFO - generate_tiles - x93_y68_z10.0 47616 ... /home/pashaa/vmount/PRO-12-123/tiling/test/sav...
2022-07-14 08:23:24,029 - INFO - generate_tiles - x93_y69_z10.0 47616 ... /home/pashaa/vmount/PRO-12-123/tiling/test/sav...
2022-07-14 08:23:24,029 - INFO - generate_tiles -
2022-07-14 08:23:24,029 - INFO - generate_tiles - [5089 rows x 9 columns]
2022-07-14 08:23:24,077 - INFO - luna.common.utils - Code block 'transform::save_tiles' took: 48.08929709997028s
2022-07-14 08:23:24,081 - INFO - luna.common.utils - ------------------------------------------------------------
2022-07-14 08:23:24,081 - INFO - luna.common.utils - Done with transform, running post-transform functions...
2022-07-14 08:23:24,081 - INFO - luna.common.utils - ------------------------------------------------------------
2022-07-14 08:23:24,090 - INFO - luna.common.utils - Adding feature segment /home/pashaa/vmount/PRO-12-123/tiling/test/saved_tiles/01OV002-bd8cdc70-3d46-40ae-99c4-90ef77.tiles.parquet to PRO_TILES_LABELED_S3
2022-07-14 08:23:24,092 - INFO - luna.common.utils - Found dataset URL = http://waystation:6077/
2022-07-14 08:23:24,093 - INFO - luna.common.utils - Adding /home/pashaa/vmount/PRO-12-123/tiling/test/saved_tiles/01OV002-bd8cdc70-3d46-40ae-99c4-90ef77.tiles.parquet to PRO_TILES_LABELED_S3 via http://waystation:6077/
2022-07-14 08:23:24,095 - INFO - luna.common.utils - SEGMENT_ID=62ce95d4ff5873883c9dae25-01OV002-bd8cdc70-3d46-40ae-99c4-90ef77
2022-07-14 08:23:24,097 - INFO - luna.common.utils - Posting to: http://waystation:6077/datasets/PRO_TILES_LABELED_S3/segments/62ce95d4ff5873883c9dae25-01OV002-bd8cdc70-3d46-40ae-99c4-90ef77
2022-07-14 08:24:02,769 - INFO - luna.common.utils - <Response [500]>: <!doctype html>
2022-07-14 08:24:02,769 - INFO - luna.common.utils - <html lang=en>
2022-07-14 08:24:02,769 - INFO - luna.common.utils - <title>500 Internal Server Error</title>
2022-07-14 08:24:02,769 - INFO - luna.common.utils - <h1>Internal Server Error</h1>
2022-07-14 08:24:02,769 - INFO - luna.common.utils - <p>The server encountered an internal error and was unable to complete your request. Either the server is overloaded or there is an error in the application.</p>
2022-07-14 08:24:02,772 - INFO - luna.common.utils - Done.
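When the post succeeds, the log ends with a REST response such as <Response [200]>: {"dsid":"PRO_TILES_S3","rows_written":1303,"sid":"626c0a1dbfa0f49e3e026f6a-01OV002-bd8cdc70-3d46-40ae-99c4-90ef77","status":"success"}, which tells us that the 1303 tiles for this slide ID were successfully written. Note that the run captured above ended with <Response [500]> instead, so that particular post did not succeed.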
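To check the outcome without reading the log by eye, the response body can be parsed. The following is a minimal sketch, assuming the success body is JSON shaped like the one quoted above. Both `segment_url` and `summarize_response` are hypothetical helpers and not part of luna; the URL layout is inferred only from the "Posting to:" log line.

```python
import json


def segment_url(dataset_url, dataset_id, collection_uuid, slide_id):
    """Rebuild the endpoint from the 'Posting to:' log line (layout inferred from the log)."""
    segment_id = f"{collection_uuid}-{slide_id}"
    return f"{dataset_url.rstrip('/')}/datasets/{dataset_id}/segments/{segment_id}"


def summarize_response(status_code, body):
    """Return (ok, rows_written) for a waystation-style response (hypothetical helper)."""
    if status_code != 200:
        return (False, 0)
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        # An HTML error page, for example, is not valid JSON.
        return (False, 0)
    if payload.get("status") != "success":
        return (False, 0)
    return (True, int(payload.get("rows_written", 0)))


success_body = (
    '{"dsid":"PRO_TILES_S3","rows_written":1303,'
    '"sid":"626c0a1dbfa0f49e3e026f6a-01OV002-bd8cdc70-3d46-40ae-99c4-90ef77",'
    '"status":"success"}'
)
print(summarize_response(200, success_body))        # (True, 1303)
print(summarize_response(500, "<!doctype html>"))   # (False, 0)
print(segment_url(
    "http://waystation:6077/",
    "PRO_TILES_LABELED_S3",
    "62ce95d4ff5873883c9dae25",
    "01OV002-bd8cdc70-3d46-40ae-99c4-90ef77",
))
```

The last call prints the same URL that appears in the "Posting to:" line of the log.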
Since waystation was configured with a local MinIO instance, we can read our dataset directly over S3:
In [ ]:
df_tiles = pd.read_parquet(
    "s3://datasets/PRO_TILES_LABELED_S3/",
    storage_options={
        "key": 'admin',
        "secret": 'password',
        "client_kwargs": {"endpoint_url": "http://minio:9000"}
    }
)
df_tiles
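Because the same storage_options dictionary is needed every time the dataset is read, it may be convenient to build it in one place. The sketch below is one possible way to do that; `minio_storage_options` and the `MINIO_*` environment variable names are hypothetical, and the fallback values are simply the ones used in this tutorial.

```python
import os


def minio_storage_options(env=None):
    """Build storage_options for an S3-compatible endpoint.

    The MINIO_* variable names are hypothetical; the fallbacks are the
    values used in this tutorial.
    """
    env = os.environ if env is None else env
    return {
        "key": env.get("MINIO_ACCESS_KEY", "admin"),
        "secret": env.get("MINIO_SECRET_KEY", "password"),
        "client_kwargs": {
            "endpoint_url": env.get("MINIO_ENDPOINT_URL", "http://minio:9000")
        },
    }


# With no overrides set, the tutorial defaults are returned.
options = minio_storage_options(env={})
print(options["client_kwargs"]["endpoint_url"])  # http://minio:9000
```

The result can then be passed as `pd.read_parquet("s3://datasets/PRO_TILES_LABELED_S3/", storage_options=minio_storage_options())`, which keeps credentials out of the notebook cells.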
In [ ]:
from luna.common.utils import LunaCliClient
import os


def pipeline(slide_id, input_slide, input_annotations):
    print(os.environ)
    client = LunaCliClient("~/vmount/PRO-12-123/2_tiling-waystation", slide_id)
    # Register the pipeline inputs under the names used by the steps below
    client.bootstrap("slide", input_slide)
    client.bootstrap("annotations", input_annotations)
    # Tile the slide
    client.configure("generate_tiles", "slide",
        tile_size=128,
        requested_magnification=10
    ).run("source_tiles")
    # Keep only tiles that pass the Otsu score filter
    client.configure("detect_tissue", "slide", "source_tiles",
        filter_query="otsu_score > 0.1",
        requested_magnification=2
    ).run("detected_tiles")
    # Label the remaining tiles using the annotations
    client.configure("label_tiles", "annotations", "detected_tiles").run("labled_tiles")
    # Save the labeled tiles and add them to the PRO_TILES_LABELED_S3 dataset
    client.configure("save_tiles", "slide", "labled_tiles",
        num_cores=4, batch_size=200, dataset_id='PRO_TILES_LABELED_S3'
    ).run("saved_tiles")
In [ ]:
from concurrent.futures import ThreadPoolExecutor
import pandas as pd

df_slides = pd.read_parquet("../PRO-12-123/data/toy_data_set/table/SLIDES/slide_ingest_PRO-12-123.parquet")

# Run the pipeline once per slide, using a pool of 5 worker threads
with ThreadPoolExecutor(5) as pool:
    for index, row in df_slides.iterrows():
        print(index)
        pool.submit(pipeline, index, row.slide_image, "../PRO-12-123/data/toy_data_set/table/ANNOTATIONS")
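One thing to keep in mind with `pool.submit` is that an exception raised inside a worker is stored on the returned future and only surfaces when `.result()` is called, so a failed slide can go unnoticed. The sketch below shows one way to collect results and errors per slide. `toy_pipeline` and `run_all` are hypothetical stand-ins so that the example runs without luna or the slide data.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def toy_pipeline(slide_id):
    """Stand-in for pipeline(); fails for one slide to show error handling."""
    if slide_id == "slide_b":
        raise RuntimeError("simulated failure")
    return f"{slide_id}: done"


def run_all(slide_ids, worker, max_workers=5):
    """Run worker over slide_ids and return (results, errors) keyed by slide ID."""
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers) as pool:
        # Remember which slide each future belongs to
        futures = {pool.submit(worker, sid): sid for sid in slide_ids}
        for future in as_completed(futures):
            sid = futures[future]
            try:
                results[sid] = future.result()
            except Exception as exc:
                errors[sid] = exc
    return results, errors


results, errors = run_all(["slide_a", "slide_b", "slide_c"], toy_pipeline)
print(sorted(results))  # ['slide_a', 'slide_c']
print(sorted(errors))   # ['slide_b']
```

Applied to the cell above, the same pattern would keep the futures returned by `pool.submit(pipeline, ...)` and inspect them once all slides have been processed.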
Now, when we read the dataset again, we see our tiles from all slides:
In [ ]:
df_tiles = pd.read_parquet(
    "s3://datasets/PRO_TILES_LABELED_S3/",
    storage_options={
        "key": 'admin',
        "secret": 'password',
        "client_kwargs": {"endpoint_url": "http://minio:9000"}
    }
)
print(df_tiles['regional_label'].value_counts())
df_tiles
We still have 2120 tumor, 860 stroma, and 751 fat tile images and labels ready to train your model, this time aggregated at an S3 endpoint!
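Beyond the overall label counts, it can be useful to see how the labels are spread across slides before training. The sketch below uses a small made-up DataFrame in place of df_tiles; the `slide_id` column name is an assumption about the dataset's layout, and the rows are invented for illustration only.

```python
import pandas as pd

# Toy stand-in for df_tiles; slide IDs and labels are made up for illustration.
toy_tiles = pd.DataFrame(
    {
        "slide_id": ["s1", "s1", "s1", "s2", "s2", "s2"],
        "regional_label": ["tumor", "tumor", "stroma", "tumor", "fat", "fat"],
    }
)

# Overall counts, as in the cell above
overall = toy_tiles["regional_label"].value_counts()
print(overall.to_dict())  # {'tumor': 3, 'fat': 2, 'stroma': 1}

# Counts per slide, one column per label (0 where a label is absent)
per_slide = (
    toy_tiles.groupby("slide_id")["regional_label"]
    .value_counts()
    .unstack(fill_value=0)
)
print(per_slide)
```

A slide that contributes only one label, or far more tiles than the others, shows up immediately in the per-slide table.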