>>> HI, EVERYONE. I'M SO GLAD SO MANY
OF YOU HAVE MADE IT TO THIS SESSION.
NOT THE BEST SPOT TO HAVE, SO THANK
YOU SO MUCH FOR COMING TO MY SESSION.
MY NAME IS THOMAS.
MY SESSION IS ALL ABOUT DATA
MODELING AND DATA PARTITIONING.
SO LET'S START WITH A QUICK SHOW OF
HANDS. HOW MANY RELATIONAL DATABASE
USERS DO WE HAVE IN THE ROOM? GOOD.
SO YOU MADE IT TO THE RIGHT SESSION. 
MY GOAL OVER THE NEXT 45 MINUTES 
IS TO HELP YOU GET A BETTER UNDERSTANDING 
OF WHAT IT MEANS TO MODEL AND PARTITION 
DATA. AS I USUALLY DO WHEN I'M GIVING
SUCH PRESENTATIONS, INSTEAD OF GIVING
YOU A DRY INTRODUCTION TO DIFFERENT
CONCEPTS, I AM GOING TO EXPLORE
THEM IN THE CONTEXT OF A REAL WORLD
EXAMPLE. I'LL DO MY BEST TO MAKE
SURE THAT YOU UNDERSTAND THE MAIN
DIFFERENCES BETWEEN MODELING DATA
FOR COSMOS DB AND FOR A RELATIONAL
DATABASE. WHAT IS COSMOS DB?
WE HAVE SO MANY DIFFERENT WAYS
TO ANSWER THAT. IT IS A GLOBALLY DISTRIBUTED,
ALWAYS ON DATABASE. BUT FOR THE
PURPOSE OF MY PRESENTATION I'LL
INTRODUCE IT AS A NOSQL
DATABASE. IT IS NONRELATIONAL AND HORIZONTALLY
SCALABLE. LET'S START WITH HORIZONTALLY 
SCALABLE. SO WHENEVER YOU ARE WORKING 
WITH THIS, YOU HAVE TO UNDERSTAND 
THAT UNLESS YOU ARE WORKING ON A 
VERY TRIVIAL KIND OF USE CASE WITH 
VERY LOW REQUIREMENTS IN TERMS OF
STORAGE, MOST LIKELY, YOUR DATA
WILL BE STORED NOT ON ONLY ONE PHYSICAL
SERVER BUT ON A NUMBER OF DIFFERENT
PHYSICAL SERVERS. THIS IS SOMETHING
WE ABSTRACT. UNDER THE HOOD, YOU
ARE DEALING WITH A CLUSTER OF MULTIPLE
PHYSICAL SERVERS. THIS IS HOW WE
SCALE. THIS IS HOW WE DELIVER UNLIMITED
STORAGE CAPACITY BECAUSE WHENEVER
YOU NEED MORE STORAGE, WE ADD MORE
SERVERS TO YOUR CLUSTER TO ACCOMMODATE
THAT. THIS IS ALSO HOW WE ACHIEVE
UNLIMITED THROUGHPUT. WE ADD MORE SERVERS,
WE ADD MORE COMPUTE POWER TO MAKE
SURE WE CAN MEET YOUR REQUIREMENTS. 
THE SECOND ONE IS NONRELATIONAL. 
WHEN YOU ARE WORKING WITH A RELATIONAL
DATABASE, YOU HAVE WAYS TO DEFINE
THE CONSTRAINTS BETWEEN WHAT YOU
ARE STORING IN THE DATABASE. IT
LETS YOU DEFINE FOREIGN KEYS AND
PERFORM JOIN OPERATIONS ACROSS
THE DATA SET. NONRELATIONAL DATABASES
DON'T IMPLEMENT SUCH RELATIONAL
CONSTRUCTS. AS I JUST MENTIONED,
WE ARE HORIZONTALLY SCALABLE.
YOUR DATA WILL MOST LIKELY SIT ON
NOT JUST ONE BUT MULTIPLE PHYSICAL
SERVERS. TENS, HUNDREDS, MAYBE THOUSANDS
DEPENDING ON YOUR STORAGE AND THROUGHPUT
REQUIREMENTS. I AM NOT SAYING
IT WOULD BE TECHNICALLY IMPOSSIBLE
TO ENFORCE RELATIONAL CONSTRAINTS
ACROSS A CLUSTER OF SERVERS BUT
THIS WOULD HAVE AN IMPACT ON THE
DATABASE AND ITS PERFORMANCE. BECAUSE
IT RUNS ON WHAT WE CALL A PREDICTABLE
PERFORMANCE MODEL, WE DON'T EXPOSE
ANY WAY TO DEFINE RELATIONAL
CONSTRAINTS. NOW, COSMOS DB
IS A NONRELATIONAL DATABASE. IS
IT STILL SUITABLE FOR RELATIONAL
WORKLOADS? THE ANSWER IS YES, OR
THIS WOULD BE MY LAST SLIDE AND
I COULD LET YOU GO. WHEN YOU THINK ABOUT
IT, MOST OF THE REAL WORLD, REAL LIFE
USE CASES WE ARE DEALING WITH ARE ALWAYS
MORE OR LESS RELATIONAL. BECAUSE
COSMOS DB DOESN'T LET YOU DEFINE RELATIONAL
CONSTRAINTS, WE HAVE TO USE DIFFERENT
PRACTICES TO MATERIALIZE THE DIFFERENT
RELATIONSHIPS. THE KEY THING TO UNDERSTAND
HERE IS THAT THIS APPROACH IS VERY 
DIFFERENT TO WHAT YOU USUALLY DO 
WHEN YOU ARE MODELING DATA ON A 
RELATIONAL DATABASE. ALL THOSE BEST 
PRACTICES, ALL THOSE INTUITIONS 
THAT YOU HAVE BUILT OVER THE PAST
DECADE, THOSE BEST PRACTICES
DON'T TRANSLATE VERY WELL TO THE
NONRELATIONAL WORLD. MANY
TIMES, THEY ARE EVEN COUNTERPRODUCTIVE. FEAR NOT.
I AM HERE WITH YOU TODAY TO INTRODUCE
YOU TO THE RIGHT WAY TO MODEL DATA ON A
NOSQL DATABASE. SO AS I SAID, MOST
OF THE TIME I LIKE TO EXPLORE THE
CONCEPTS IN THE LIGHT OF A REAL
WORLD EXAMPLE. THE USE CASE I HAVE
CHOSEN TODAY IS THE CASE OF AN
E-COMMERCE WORKLOAD. WHAT WE SEE
HERE IS THAT WE HAVE CUSTOMERS OBVIOUSLY. 
EACH CUSTOMER CAN HAVE MULTIPLE 
ADDRESSES. THIS IS A ONE TO MANY 
RELATIONSHIP. EACH CUSTOMER CAN 
HAVE ONE AND ONLY ONE PASSWORD THAT 
IS HERE STORED IN A DIFFERENT ENTITY. 
WE HAVE PRODUCTS. EACH PRODUCT BELONGS 
TO JUST ONE CATEGORY. THAT IS ANOTHER 
ONE TO MANY RELATIONSHIP. WE HAVE
PRODUCT TAGS SO EACH PRODUCT CAN 
BE TAGGED WITH MULTIPLE TAGS. THIS 
IS A MANY TO MANY RELATIONSHIP. 
EACH PRODUCT CAN HAVE MULTIPLE TAGS. 
EACH TAG CAN REFER TO MULTIPLE PRODUCTS. 
LASTLY, WE HAVE SALES ORDERS. THESE 
ENTITIES, THEY MATERIALIZE A PURCHASE. 
EVERY TIME A CUSTOMER MAKES A PURCHASE,
WE CREATE A SALES ORDER AND WE ATTACH
TO THAT SALES ORDER SALES ORDER DETAILS.
THOSE ARE COPIES OF THE PRODUCTS
AT THE TIME OF THE PURCHASE. SO 
HOW WOULD WE IMPLEMENT THAT
ON COSMOS DB TODAY? LET'S START
WITH THE CUSTOMERS. COSMOS DB IS A DOCUMENT
DATABASE SO WE WILL STORE JSON
DOCUMENTS. THE FIRST STEP IS TO
TRANSLATE THOSE DIFFERENT ENTITY
TYPES INTO DOCUMENTS. WE ARE GOING
TO LOOK AT ALL THE DOCUMENTS
THESE ENTITY TYPES DEFINE. WE HAVE
ONE DOCUMENT FOR THE CUSTOMER, ONE
FOR THE CUSTOMER ADDRESS AND ONE
FOR THE CUSTOMER PASSWORD. SO WHAT
YOU CAN NOTE HERE IS IF I JUST DO
THAT, I AM KEEPING THE REFERENCES
BETWEEN THOSE THREE ENTITY TYPES.
ADDRESSES AND PASSWORDS HAVE A CUSTOMER
ID. I KEEP THAT REFERENCE BETWEEN
THOSE DIFFERENT ENTITIES. BUT BECAUSE
WE ARE JUST STORING JSON AND WE CAN STORE
ANY SHAPE OF JSON, ANOTHER WAY OF
MATERIALIZING OUR CUSTOMER WOULD
BE TO PUT EVERYTHING IN JUST ONE
DOCUMENT. WHAT IF WE STARTED WITH
ALL THE DATA THAT DEFINES THE
CUSTOMER AND TO THAT ORIGINAL DOCUMENT,
WE JUST ADD AN ARRAY CONTAINING
THE ADDRESSES AND ANOTHER SUB-OBJECT
TO MATERIALIZE THE PASSWORD DATA?
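AS A ROUGH SKETCH OF THE TWO SHAPES
DESCRIBED ABOVE (NOT CODE FROM THE
SESSION), THE PYTHON BELOW BUILDS THE
REFERENCED FORM AND THEN FOLDS IT INTO
ONE EMBEDDED DOCUMENT. EVERY PROPERTY
NAME AND VALUE IS A HYPOTHETICAL
PLACEHOLDER; THE SESSION ONLY NAMES
THE ENTITIES.

```python
# Referenced form: three documents linked by a customer ID.
# All property names and values here are illustrative assumptions.
customer = {"id": "c-1", "firstName": "Sam", "lastName": "Book"}
addresses = [
    {"id": "a-1", "customerId": "c-1", "city": "Paris"},
    {"id": "a-2", "customerId": "c-1", "city": "Lyon"},
]
password = {"id": "pw-1", "customerId": "c-1", "hash": "<hash>", "salt": "<salt>"}


def strip_reference(doc):
    """Drop the keys that only existed to link the document to its customer."""
    return {k: v for k, v in doc.items() if k not in ("id", "customerId")}


def embed(customer, addresses, password):
    """Return one customer document with its addresses and password embedded."""
    embedded = dict(customer)
    embedded["addresses"] = [
        strip_reference(a) for a in addresses if a["customerId"] == customer["id"]
    ]
    embedded["password"] = strip_reference(password)
    return embedded


embedded_customer = embed(customer, addresses, password)
print(embedded_customer)
```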
TWO DIFFERENT WAYS TO MATERIALIZE 
THAT. WE CAN EMBED EVERYTHING IN 
THE SAME DOCUMENT OR WE CAN KEEP 
THOSE DIFFERENT ENTITIES SEPARATED 
IN DIFFERENT DOCUMENTS. WHO THINKS 
WE SHOULD GO WITH THE FIRST OPTION 
WHICH IS EMBEDDING? OKAY. WHO THINKS 
WE SHOULD REFERENCE AND KEEP THE
THINGS SEPARATED? INTERESTING. WE HAVE
A DIVIDED AUDIENCE TODAY.
THERE ARE RULES TO FOLLOW WHEN YOU DECIDE.
EMBEDDING WORKS WHEN YOU HAVE A 
ONE TO ONE OR ONE TO FEW RELATIONSHIP 
BETWEEN THE ENTITIES. WE HAVE A 
ONE TO ONE BETWEEN THE CUSTOMER 
AND PASSWORD AND A ONE TO FEW BETWEEN
THE CUSTOMER AND ADDRESSES. THERE
IS AN UPPER BOUND TO HOW MANY ADDRESSES
WE WILL LET THEM DEFINE ON THE E-COMMERCE
PLATFORM. IT MAKES SENSE TO LET EVERYTHING
SIT TOGETHER, EVEN MORE BECAUSE WHEN
WE ARE GOING TO FETCH A CUSTOMER, MOST
PROBABLY WE ARE GOING TO -- WE WILL
WANT TO FETCH EVERYTHING. NOT ONLY
THE CUSTOMER BUT THE ADDRESSES, MAYBE
EVEN THE PASSWORD. BY PUTTING EVERYTHING
IN JUST ONE DOCUMENT, WE JUST HAVE
TO READ THAT ONE DOCUMENT AND GET
ALL THE DATA THAT IS RELATED TO
THE CUSTOMER. IF WE LET OUR
CUSTOMERS DEFINE AS MANY ADDRESSES AS THEY WANTED,
THIS WOULD BLOAT OUR JSON DOCUMENT.
AND REFERENCING WORKS BEST WHEN
THE DIFFERENT RELATED ITEMS ARE EITHER
QUERIED OR UPDATED INDEPENDENTLY.
IF WE WERE TO FETCH ONE ADDRESS
AND ONE ADDRESS ONLY, WE WOULD STICK
WITH REFERENCING. THAT IS NOT OUR
USE CASE HERE. WHAT MAKES SENSE IS
TO EMBED EVERYTHING IN THE SAME
JSON DOCUMENT. THAT IS OUR FIRST
ENTITY. WE ARE MAKING GOOD PROGRESS
ON THE MODELING EXERCISE. NOW THAT 
WE HAVE DEFINED OUR FIRST ENTITY, 
WHERE ARE WE GOING TO STORE THAT
THING? WE ARE GOING TO STORE IT IN
A COSMOS DB CONTAINER. WE
ARE GOING TO NAME THAT THING CUSTOMERS.
NO SURPRISE HERE. THIS MEANS PARTITION
KEY, PK. WHENEVER YOU ARE CREATING
A COSMOS DB CONTAINER, YOU HAVE
TO DEFINE THE PARTITION KEY OF THAT
CONTAINER. IT IS THE RIGHT TIME RIGHT
NOW TO DIVE INTO PARTITIONING
AND TO MAKE SURE YOU ALL UNDERSTAND
WHAT PARTITIONING IS ON COSMOS
DB. AS I MENTIONED, WHENEVER YOU
ARE INTERACTING WITH A COSMOS
DB CONTAINER, THIS IS AN ABSTRACTION
ON TOP OF PHYSICAL SERVERS.
EVERY TIME YOU ARE WRITING DOCUMENTS
TO THE COSMOS DB CONTAINER,
THOSE DOCUMENTS ARE DISPATCHED AND 
WRITTEN TO DIFFERENT PHYSICAL SERVERS. 
HOW DO WE DECIDE THAT THIS SHOULD 
BE WRITTEN TO THIS SERVER AND THAT 
DOCUMENT SHOULD BE WRITTEN TO THAT 
SERVER? TECHNICALLY WE DON'T DIRECTLY
ASSOCIATE DOCUMENTS WITH PHYSICAL
SERVERS. WE WRITE YOUR DOCUMENTS
INTO VIRTUAL BUCKETS OF DATA AND 
THESE ARE CALLED LOGICAL PARTITIONS. 
IT IS THOSE THAT HAPPEN TO SIT ON 
DIFFERENT PHYSICAL SERVERS. SO FOR 
THE EXERCISE OF DATA MODELING, USUALLY 
I RECOMMEND TO JUST FORGET ABOUT 
PHYSICAL SERVERS BECAUSE THIS IS 
OUR OWN IMPLEMENTATION BUT YOU HAVE 
TO UNDERSTAND WHENEVER YOU ARE WRITING 
DOCUMENTS, THOSE DOCUMENTS END UP 
IN DIFFERENT LOGICAL PARTITIONS. 
THE QUESTION REMAINS THE SAME. HOW 
DO WE DECIDE THAT THIS DOCUMENT
SHOULD GO IN THAT LOGICAL PARTITION
AND THIS DOCUMENT INTO THAT PARTITION?
YOU HAVE TO DEFINE THE PARTITION
KEY OF THAT CONTAINER. THIS IS THE 
NAME OF THE PROPERTY WE ARE GOING 
TO LOOK UP IN YOUR JSON DOCUMENTS 
IN ORDER TO UNDERSTAND WHICH PARTITION 
IT BELONGS TO. IF WE TOOK THE EXAMPLE 
OF A CONTAINER PARTITIONED BY USERNAME
WE WOULD END UP WITH DIFFERENT LOGICAL
PARTITIONS, ONE CONTAINING ALL THE
DOCUMENTS WHERE USERNAME EQUALS
ANDREW, AND THIS ONE WHERE USERNAME
EQUALS DEBORAH AND SO ON. FIRST,
WE HAVE AN UPPER LIMIT ON THE SIZE
OF EACH JSON DOCUMENT. THAT LIMIT
RIGHT NOW IS TWO MEGABYTES. IT IS
IMPOSSIBLE TO STORE A DOCUMENT THAT
IS BIGGER THAN 2 MEGABYTES.
THEN, I WANT A MORE OR LESS
EVEN DISTRIBUTION ACROSS
MY DIFFERENT LOGICAL PARTITIONS. I
DON'T WANT TO END UP WITH A DESIGN
WHERE MOST OF MY DOCUMENTS ARE WRITTEN
TO ONE LOGICAL PARTITION AND THE
OTHERS CONTAIN JUST TWO OR THREE
DOCUMENTS. WHAT WE LOOK FOR IS A
DESIGN WHERE YOUR DOCUMENTS ARE
MORE OR LESS EVENLY SPREAD ACROSS
THE LOGICAL PARTITIONS. SAME GOES
FOR THE THROUGHPUT. WE DON'T WANT
A DESIGN WHERE MOST OF YOUR REQUESTS
WILL HIT JUST ONE LOGICAL PARTITION.
THIS WOULD BE A BOTTLENECK, WHAT
WE CALL A HOT PARTITION. WE WANT YOUR
REQUESTS TO BE, AGAIN, MORE OR LESS
EVENLY DISTRIBUTED ACROSS YOUR DIFFERENT
PARTITIONS. THE CHOICE OF THE PARTITION
KEY IS SUPER IMPORTANT WHEN IT COMES 
TO THE SCALABILITY OF YOUR DATA 
MODEL. WE HAVE A CONTAINER PARTITIONED BY
USERNAME. WE HAVE A QUERY WHERE USERNAME
EQUALS MARK. YOU CAN SEE THIS
QUERY FILTERS ON USERNAME.
ALL THE RESULTS SIT IN JUST
ONE LOGICAL PARTITION WHICH IS THE
ONE WHERE USERNAME EQUALS MARK.
AND COSMOS DB UNDERSTANDS THIS IS WHAT WE
CALL A SINGLE PARTITION QUERY. IT ROUTES
THAT TO THE ONLY LOGICAL PARTITION
THAT IS CONTAINING THE RESULTS.
WHAT THAT MEANS IS THIS QUERY WILL
HIT ONE AND ONLY ONE PHYSICAL SERVER.
THIS WILL BE MORE OR LESS THE
SAME NO MATTER HOW MUCH DATA YOU
HAVE IN THE CONTAINER. WHAT HAPPENS
IF I DON'T FILTER ON THE PARTITION
KEY? IF I WERE TO FILTER ON
JUST FAVORITE COLOR FOR EXAMPLE,
WE HAVE ABSOLUTELY NO CLUE. WE CANNOT
GUESS WHERE WE HAVE THE RESULTS FOR THAT
QUERY. THE BEST THING WE CAN
DO IS FAN THAT OUT TO EACH AND
EVERY LOGICAL PARTITION
UNDERLYING YOUR CONTAINER AND, AS
YOU UNDERSTAND, THIS IS A QUERY
THAT WILL HIT EACH AND EVERY PHYSICAL
SERVER UNDER THE HOOD.
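TO MAKE THE DIFFERENCE BETWEEN THE TWO
KINDS OF QUERY CONCRETE, HERE IS A TOY
IN-MEMORY MODEL, NOT THE ACTUAL ROUTING
LOGIC OF THE SERVICE. IT ASSUMES THREE
SERVERS AND A PLAIN HASH-MODULO
PLACEMENT, WHICH IS A SIMPLIFICATION;
`server_for`, `write` AND `query` ARE
HYPOTHETICAL HELPERS AND THE DOCUMENTS
ARE MADE UP.

```python
import hashlib
from collections import defaultdict

SERVER_COUNT = 3  # assumed number of physical servers, for illustration only


def server_for(partition_key_value):
    """Map a partition key value to a server (simplified: hash modulo count)."""
    digest = hashlib.sha256(partition_key_value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % SERVER_COUNT


# server index -> partition key value (one logical partition) -> documents
servers = defaultdict(lambda: defaultdict(list))


def write(doc):
    """Store a document in the logical partition named by its username."""
    servers[server_for(doc["username"])][doc["username"]].append(doc)


def query(**filters):
    """Return (matching documents, number of servers visited)."""
    if "username" in filters:
        # Single partition query: the filter names the partition key.
        targets = [server_for(filters["username"])]
    else:
        # Cross partition query: nothing narrows the search, so fan out.
        targets = list(range(SERVER_COUNT))
    hits = []
    for index in targets:
        for docs in servers[index].values():
            hits.extend(
                d for d in docs if all(d.get(k) == v for k, v in filters.items())
            )
    return hits, len(targets)


for name, color in [("mark", "blue"), ("mark", "red"), ("deborah", "blue")]:
    write({"username": name, "favoriteColor": color})

single_hits, single_visited = query(username="mark")
fanout_hits, fanout_visited = query(favoriteColor="blue")
print(single_visited, fanout_visited)
```

THE FILTER ON THE PARTITION KEY VISITS
ONE SERVER; THE FILTER ON ANY OTHER
PROPERTY VISITS ALL OF THEM.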
A CROSS PARTITION QUERY WORKS. TECHNICALLY,
WE SUPPORT THAT FAN OUT BUT YOU
HAVE TO KEEP IN MIND IT HAS AN IMPACT
ON THE LATENCY OF THE QUERY, IT
WILL TAKE A BIT MORE TIME, BUT ALSO
ON THE NUMBER OF REQUEST UNITS THAT
THIS QUERY IS GOING TO CONSUME. THIS
IS OUR PERFORMANCE CURRENCY. EVERY
TIME YOU ISSUE AN OPERATION AGAINST
COSMOS DB, IT HAS A HARDWARE COST
AND WE HAVE ABSTRACTED THE COST
THAT IS RELATED TO CPU, MEMORY AND
IO WITH THIS PERFORMANCE CURRENCY.
SO ANY OPERATION HAS A COST IN REQUEST
UNITS. WHAT WE TRY TO ACHIEVE WHEN
WE ARE MODELING DATA ON COSMOS DB
IS TO END UP WITH A DESIGN WHERE
YOUR MOST CRITICAL OPERATIONS ARE
CHEAP IN TERMS OF REQUEST UNITS.
THAT WILL NOT BE THE CASE FOR
A CROSS PARTITION QUERY. NOW THAT
WE ARE STRONG WITH ALL THAT KNOWLEDGE
ABOUT PARTITIONING, LET'S GO BACK
TO OUR EXAMPLE TO FIND THE BEST
PARTITION KEY FOR OUR CUSTOMER.
THERE IS NO WAY TO GUESS THE BEST
PARTITION KEY JUST LOOKING AT THE
DATA. THIS IS SOMETHING YOU CAN
FIND OUT BY LOOKING AT THE OPERATIONS,
THE MOST IMPORTANT
REQUESTS THAT WE ARE GOING TO ISSUE
AGAINST THE CUSTOMERS. FOR EXAMPLE,
I HAVE JUST DEFINED TWO OF THEM.
WE SHOULD BE ABLE TO CREATE A CUSTOMER
AND RETRIEVE A CUSTOMER. SO WHEN
I LOG BACK INTO MY E-COMMERCE
WEB SITE AND I CLICK ON MY PROFILE,
I SHOULD BE ABLE TO SEE MY USER
PROFILE. WE SHOULD BE ABLE TO RETRIEVE
A CUSTOMER BY THE ID. AND BECAUSE THAT OPERATION
IS ONLY GOING TO FILTER ON THE
ID OF THE CUSTOMER, ID
IS PROBABLY A GOOD CANDIDATE FOR
THE PARTITION KEY HERE. WHAT IS
GOING TO HAPPEN IF WE CHOOSE THE
ID AS THE PARTITION KEY? WE WILL
END UP WITH AS MANY LOGICAL PARTITIONS 
AS DOCUMENTS IN OUR DATABASE. AND 
EACH LOGICAL PARTITION WILL ONLY 
CONTAIN ONE DOCUMENT BECAUSE THERE 
IS A ONE TO ONE BETWEEN THE ID AND 
THE PARTITION KEY. AND THAT IS FINE. 
OKAY. SOME PEOPLE SOMETIMES ARE
SCARED OF CHOOSING A PARTITION KEY
THAT HAS A HIGH CARDINALITY AND
A WIDE SPECTRUM OF POTENTIAL VALUES.
THIS IS ACTUALLY FINE. IF I HAVE
MILLIONS OF CUSTOMERS, I WILL END
UP WITH MILLIONS OF LOGICAL PARTITIONS.
IT DOESN'T MEAN I WILL HAVE MILLIONS
OF UNDERLYING PHYSICAL SERVERS.
IT IS A GOOD THING TO CHOOSE A PARTITION 
KEY THAT HAS A HIGH CARDINALITY. 
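A SMALL EXPERIMENT CAN ILLUSTRATE WHY
HIGH CARDINALITY IS NOT A PROBLEM. THIS
IS ONLY A SKETCH UNDER ASSUMPTIONS: FOUR
SERVERS, A HASH-MODULO PLACEMENT THAT
STANDS IN FOR WHATEVER THE SERVICE
REALLY DOES, AND GENERATED CUSTOMER IDS.

```python
import hashlib
from collections import Counter

SERVER_COUNT = 4  # assumed, for illustration


def server_for(partition_key_value):
    """Simplified placement: hash of the partition key value, modulo servers."""
    digest = hashlib.sha256(partition_key_value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % SERVER_COUNT


customer_ids = [f"customer-{n}" for n in range(100_000)]

# Partitioning by id: one logical partition per customer document.
logical_partition_count = len(set(customer_ids))

# Many logical partitions still land on a handful of servers.
documents_per_server = Counter(server_for(cid) for cid in customer_ids)

print(logical_partition_count)
print(sorted(documents_per_server.items()))
```

ONE HUNDRED THOUSAND LOGICAL PARTITIONS
END UP SPREAD OVER ONLY FOUR SERVERS,
AND MANY DISTINCT KEY VALUES TEND TO
SPREAD FAIRLY EVENLY.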
IF I HAVE MY CUSTOMERS CONTAINER
THAT IS PARTITIONED BY ID, IF I
JUST WANT TO RETRIEVE A CUSTOMER
BY ID, WHAT DOES IT LOOK LIKE? HERE,
I'M GOING TO THE AZURE PORTAL AND
GOING TO USE OUR BRAND-NEW NOTEBOOK
EXPERIENCE. I DON'T KNOW IF SOME
OF YOU HAVE TRIED OUR INTEGRATED
NOTEBOOK EXPERIENCE. YOU SHOULD.
THIS IS GREAT. NOW WHEN YOU ARE
CREATING A COSMOS DB ACCOUNT, YOU
CAN OPT IN AND HAVE AN EASY WAY
TO EXPLORE AND VISUALIZE YOUR DATA.
IT IS PRETTY EASY TO READ. IN ORDER
TO JUST FETCH ONE DOCUMENT, ONE
CUSTOMER BY ITS ID, I WILL CALL THAT
METHOD CALLED READ ITEM. I JUST
GIVE THE ID AND THE PARTITION KEY
TO COSMOS DB, I GET BACK THE CORRESPONDING
DOCUMENT. I JUST PASS THESE TWO
VALUES TO MY METHOD AND WHAT I GET
BACK IS MY CUSTOMER. MR. BOOK
IN THAT EXAMPLE IS ONE OF THE CUSTOMERS.
WHAT IS THE COST IN TERMS OF REQUEST
UNITS? THIS WILL BE VERY CHEAP.
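THE NOTEBOOK CODE ITSELF IS NOT IN THE
TRANSCRIPT. AS A STAND-IN, THIS TOY
CONTAINER SHOWS WHY A POINT READ IS SO
DIRECT: WITH BOTH THE ID AND THE
PARTITION KEY IN HAND IT IS A KEY
LOOKUP, NOT A QUERY. `ToyContainer` IS
HYPOTHETICAL; ITS `read_item` ONLY
MIRRORS THE METHOD NAME MENTIONED ABOVE.

```python
class ToyContainer:
    """In-memory stand-in for a container; not the real service or SDK."""

    def __init__(self, partition_key_property):
        self.partition_key_property = partition_key_property
        self._items = {}  # (partition key value, id) -> document

    def create_item(self, doc):
        key = (doc[self.partition_key_property], doc["id"])
        self._items[key] = doc

    def read_item(self, item, partition_key):
        # Point read: both values are known, so this is a direct lookup
        # rather than a query that has to be parsed and executed.
        return self._items[(partition_key, item)]


customers = ToyContainer(partition_key_property="id")
customers.create_item({"id": "c-42", "lastName": "Book"})

# The container is partitioned by id, so the same value is passed twice.
customer = customers.read_item(item="c-42", partition_key="c-42")
print(customer)
```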
THIS ONLY COST ONE RU. THIS IS
OUR BASELINE BENCHMARK. IF YOU
JUST DO A POINT READ ON ONE DOCUMENT
THAT IS LESS THAN ONE KILOBYTE,
IT WILL COST ONE RU AND ONE
RU ONLY. NEXT, THE PRODUCT CATEGORIES.
I LOOK AT THE PROPERTIES
I HAVE TO MATERIALIZE. I PUT THEM
IN A JSON DOCUMENT. THERE ARE JUST
TWO OF THEM, ID AND NAME. I'M GOING
TO PUT THAT DOCUMENT INTO A PRODUCT
CATEGORIES CONTAINER.
HOW DO I CHOOSE A PARTITION KEY?
I AM GOING TO LOOK AT THE MOST ESSENTIAL
OPERATIONS I NEED TO PERFORM AGAINST
THE CATEGORIES. I SHOULD BE ABLE
TO CREATE OR EDIT A CATEGORY BUT
MOST IMPORTANTLY, I SHOULD BE ABLE
TO LIST ALL THE CATEGORIES AVAILABLE
ON MY E-COMMERCE WEB SITE. SO THE
SQL QUERY CORRESPONDING TO
THAT OPERATION WOULD RETURN EVERYTHING
THAT IS AVAILABLE IN THE CONTAINER.
FOR THAT SPECIFIC QUERY TO BE A
SINGLE PARTITION QUERY, WHAT I
NEED IS THAT ALL MY CATEGORIES SHOULD
SIT IN THE SAME LOGICAL PARTITION.
BECAUSE IF I ENSURE THAT, I'M SURE
THAT THIS QUERY WILL ALWAYS JUST
HIT ONE LOGICAL PARTITION. A LITTLE
TRICK I'M GOING TO USE: I WILL ADD
AN ADDITIONAL PROPERTY TO MY CATEGORIES,
A PROPERTY CALLED TYPE, AND ALL WILL
HAVE THE SAME VALUE FOR THE TYPE.
IT IS GOING TO BE A CONSTANT FIELD
THAT WILL BE CATEGORY IN MY CASE.
I'M GOING TO USE THAT TYPE AS THE
PARTITION KEY FOR MY PRODUCT
CATEGORIES. THIS MAY LOOK A BIT
WEIRD TO YOU. NO WORRIES. THIS IS
JUST AN INTERMEDIATE VERSION. WE WILL
OPTIMIZE THAT MODEL. SO NOW, HOW
DO I QUERY MY DIFFERENT CATEGORIES
GOING BACK TO THE PORTAL? I CAN
DO THAT BY JUST DOING A SELECT * FROM
C, C FOR CATEGORY, AGAINST MY PRODUCT
CATEGORY CONTAINER. IF I EXECUTE
THAT, I GET ALL THE CATEGORIES.
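THE EFFECT OF THAT CONSTANT TYPE
PROPERTY CAN BE SEEN BY GROUPING THE
SAME DOCUMENTS UNDER TWO CANDIDATE
PARTITION KEYS. THIS IS AN ILLUSTRATIVE
SKETCH: `logical_partitions` IS A
HYPOTHETICAL HELPER AND THE CATEGORY
IDS AND NAMES ARE MADE UP.

```python
from collections import defaultdict


def logical_partitions(docs, partition_key_property):
    """Group documents by the value of their partition key property."""
    groups = defaultdict(list)
    for doc in docs:
        groups[doc[partition_key_property]].append(doc)
    return dict(groups)


# Names and ids are made up; "type" carries the same constant everywhere.
categories = [
    {"id": "cat-1", "name": "Electronics", "type": "category"},
    {"id": "cat-2", "name": "Health", "type": "category"},
    {"id": "cat-3", "name": "Garden", "type": "category"},
]

by_id = logical_partitions(categories, "id")      # three partitions to visit
by_type = logical_partitions(categories, "type")  # a single partition

# Listing everything (SELECT * FROM c) now reads one logical partition.
print(len(by_id), len(by_type))
```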
AS YOU CAN SEE, EVERY TIME YOU STORE
A DOCUMENT IN COSMOS DB WE ADD
SOME SYSTEM PROPERTIES. SOMETIMES
YOU JUST DON'T NEED TO SEE THEM.
WHAT YOU CAN DO IN OUR EXAMPLE IF
YOU JUST WANT THE ID AND
NAME: I COULD REWRITE MY
QUERY BY JUST PROJECTING THE NAME
AND THE ID OF THE CATEGORY AND I
WOULD ONLY GET THE CORRESPONDING
RESULTS. SO WHAT IS THE COST OF THAT
QUERY? IT WILL BE MORE EXPENSIVE
THAN THE POINT READ. IN ORDER TO
EXECUTE THAT QUERY, THERE IS MORE
WORK TO BE DONE ON THE BACK END.
THIS IS 3.6 RUS. THE BASE COST FOR
A SQL QUERY IS AROUND
3 RUS OR MORE. I AM GOING TO VERY
QUICKLY GO OVER THE TAGS BECAUSE
THE TAGS ARE THE VERY SAME STORY 
AS THE CATEGORIES. I START BY PUTTING 
MY PROPERTIES IN A JSON DOCUMENT. 
I WILL STORE THAT DOCUMENT IN A NEW CONTAINER
NAMED PRODUCT TAGS AND, LOOKING
AT THE DIFFERENT ACCESS PATTERNS
I NEED TO FULFILL WITH THE TAGS,
THEY ARE THE VERY SAME AS FOR THE CATEGORIES:
CREATING, READING, LISTING. I WILL
ADD ANOTHER PROPERTY TO MY TAG.
I WILL USE THAT AS MY PARTITION
KEY TO MAKE SURE ALL MY TAGS SIT
IN THE SAME PARTITION. SOMETHING
THAT I HAVEN'T MENTIONED: COULD IT
BE PROBLEMATIC TO HAVE ALL MY TAGS
IN ONE LOGICAL PARTITION? CERTAINLY
NOT, BECAUSE THE UPPER LIMIT FOR A
LOGICAL PARTITION IS 10
GIGABYTES. LOOKING AT WHAT WE HAVE
TO STORE IN A CATEGORY, JUST AN
ID AND NAME, IT WILL
BE A VERY SMALL JSON DOCUMENT, SMALLER
THAN ONE KILOBYTE. MEANING I COULD
STORE MILLIONS OF DIFFERENT CATEGORIES
WHICH IS PROBABLY GOOD ENOUGH FOR
MY E-COMMERCE PLATFORM. I AM USING
THE TYPE AS THE PARTITION KEY FOR
MY PRODUCT TAGS. LET'S MOVE TO THE
PRODUCTS. SAME APPROACH, I CREATE 
MY JSON DOCUMENT WITH ALL OF MY 
PRODUCT ATTRIBUTES. WE HAVE THAT 
MANY TO MANY RELATIONSHIP BETWEEN 
THE PRODUCT AND THE TAGS. THE WAY
I'M GOING TO MATERIALIZE
THAT IS BY ADDING AN ARRAY OF TAG
IDS INSIDE MY PRODUCTS. BASICALLY,
I HAVE TWO DIFFERENT WAYS TO
MATERIALIZE THAT RELATIONSHIP. EITHER
I STORE TAG IDS IN THE PRODUCT OR
STORE PRODUCT IDS IN THE TAGS. IT IS ONE
OF THOSE TWO SIDES WHERE I NEED
TO MATERIALIZE THE RELATIONSHIP.
WHY DOES IT MAKE SENSE TO PUT IT
HERE? I WILL HAVE FAR FEWER TAGS
PER PRODUCT THAN PRODUCTS PER TAG.
IN OUR CASE, THAT IS ANOTHER GOOD
CANDIDATE FOR EMBEDDING BECAUSE
THE RELATIONSHIP IS ONE TO FEW. I
WILL HAVE, WHAT, 10, 15, 20 TAGS
PER PRODUCT? THAT IS PROBABLY
THE HIGH LIMIT. IT MAKES SENSE
TO MATERIALIZE THAT BY STORING ALL
THE TAG IDS WITH THE PRODUCT. I
AM GOING TO PUT THAT JSON DOCUMENT
INTO A NEW CONTAINER. THE INTERESTING
QUERY HERE IS LISTING ALL THE PRODUCTS
FROM A CATEGORY. CUSTOMERS CLICK ON THE
CATEGORY AND WE LIST ALL THE PRODUCTS
CORRESPONDING TO THAT CATEGORY.
IN ORDER TO BE USER FRIENDLY, WHEN
I RETURN THE PRODUCT, THE CUSTOMER
IS EXPECTING TO SEE THE FULL NAME OF
THE CATEGORY AND THE FULL NAMES OF
THE CORRESPONDING TAGS. THE SQL QUERY
TO GET ALL THE PRODUCTS HERE WOULD
BE SELECT * FROM C WHERE THE
CATEGORY ID EQUALS THE CATEGORY
ID I'M LOOKING FOR. FOR
THAT QUERY TO BE A SINGLE PARTITION
QUERY, WHAT I NEED IS THAT ALL THE 
PRODUCTS RELATED TO THE SAME CATEGORY 
SHOULD SIT IN THE SAME LOGICAL PARTITION. 
SO WHAT WOULD BE A GOOD PARTITION 
KEY HERE? CATEGORY ID. >> THAT IS 
AN EASY ONE. USUALLY IT IS NOT THAT
DIFFICULT TO DECIDE ON THE PARTITION
KEY. YOU LOOK AT WHICH PROPERTIES
WE ARE FILTERING ON. WHEN YOU ARE DEALING
WITH MULTIPLE QUERIES, YOU HAVE TO
SEE WHICH PROPERTIES WE ARE COMMONLY
FILTERING ON BUT IN OUR CASE, IT
IS A PRETTY SIMPLE PROCESS. WE ARE
FILTERING ON THE CATEGORY ID. THIS 
IS GOING TO BE MY PARTITION KEY. 
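THE RULE OF THUMB ABOVE, LOOK AT WHICH
PROPERTIES THE QUERIES COMMONLY FILTER
ON, CAN BE WRITTEN DOWN AS A TINY
COUNTING HEURISTIC. THIS IS AN EDITORIAL
SIMPLIFICATION, NOT A TOOL FROM THE
SESSION: IT IGNORES DATA AND REQUEST
DISTRIBUTION, AND THE ACCESS PATTERNS
LISTED ARE HYPOTHETICAL.

```python
from collections import Counter


def most_filtered_property(access_patterns):
    """Return the property that appears in the most query filters."""
    counts = Counter()
    for pattern in access_patterns:
        counts.update(pattern["filters"])
    property_name, _ = counts.most_common(1)[0]
    return property_name


# Hypothetical access patterns for the products container.
product_access_patterns = [
    {"operation": "list products of a category", "filters": ["categoryId"]},
    {"operation": "read one product", "filters": ["categoryId", "id"]},
]

partition_key_candidate = most_filtered_property(product_access_patterns)
print(partition_key_candidate)
```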
SO DOING THAT, I CAN NOW -- LET 
ME CHECK.  YEAH, SORRY. I HAVE A 
LITTLE ISSUE HERE BECAUSE REMEMBER, 
MY REQUIREMENT IN TERMS OF WHAT
I NEED TO DISPLAY TO MY CUSTOMERS:
EVERY TIME I RETURN A PRODUCT,
I SHOULD ALSO RETURN THE NAME OF
ITS CATEGORY AND THE NAMES OF ITS TAGS.
IF I DIDN'T DO ANYTHING ELSE HERE,
WHAT I WOULD RETURN IS A PRODUCT
CONTAINING A CATEGORY ID AND AN
ARRAY OF TAG IDS. WHAT I'M MISSING
HERE IS THE NAME OF THE CATEGORY
AND THE NAMES OF THE TAGS. WHAT THAT
MEANS IS IN ORDER TO LIST ALL THE
PRODUCTS FOR A CATEGORY, THAT FIRST
REQUEST IS NOT ENOUGH, RIGHT, BECAUSE
WE ARE MISSING THE NAME OF THE CATEGORY
AND THE NAMES OF THE TAGS. IN ORDER TO
REALLY FULFILL MY BUSINESS REQUIREMENT,
I WOULD NEED TO ISSUE A SECOND QUERY
AGAINST THE CATEGORIES TO FETCH THE
NAME OF THE CATEGORY I'M LOOKING
FOR AND THEN FOR EACH RESULT, FOR
EACH PRODUCT RETURNED BY THE FIRST
QUERY, I WOULD HAVE TO ISSUE AN
ADDITIONAL QUERY AGAINST THE TAGS
CONTAINER TO GET THE DETAILS OF
ITS TAGS. THIS WOULD LET ME CORRECTLY
POPULATE ALL THE RESULTS I HAVE
TO DISPLAY ON MY FRONT END. THIS
IS GOING TO WORK. I MEAN, DOING THAT,
FROM A FUNCTIONAL
POINT OF VIEW, THIS IS GOING TO
WORK. IT DOESN'T LOOK VERY GOOD.
IT IS PROBABLY NOT VERY SCALABLE EITHER.
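COUNTING THE ROUND TRIPS MAKES THE
SCALABILITY CONCERN VISIBLE. IN THIS
SKETCH EACH DICTIONARY OR LIST STANDS IN
FOR ONE CONTAINER AND EACH LOOKUP IS
COUNTED AS ONE QUERY; THE DATA AND THE
FUNCTION NAME ARE ASSUMPTIONS MADE FOR
THE EXAMPLE.

```python
# Made-up data; each structure stands in for one container.
categories = {"cat-1": {"id": "cat-1", "name": "Electronics"}}
tags = {
    "t-1": {"id": "t-1", "name": "wireless"},
    "t-2": {"id": "t-2", "name": "audio"},
}
products = [
    {"id": "p-1", "categoryId": "cat-1", "tagIds": ["t-1", "t-2"]},
    {"id": "p-2", "categoryId": "cat-1", "tagIds": ["t-2"]},
    {"id": "p-3", "categoryId": "cat-1", "tagIds": ["t-1"]},
]


def list_products_with_names(category_id):
    """Return (display rows, number of queries issued)."""
    queries = 0

    matching = [p for p in products if p["categoryId"] == category_id]
    queries += 1  # query 1: products of the category

    category_name = categories[category_id]["name"]
    queries += 1  # query 2: name of the category

    rows = []
    for product in matching:
        tag_names = [tags[t]["name"] for t in product["tagIds"]]
        queries += 1  # one more query per product for its tags
        rows.append(
            {"id": product["id"], "category": category_name, "tags": tag_names}
        )
    return rows, queries


rows, query_count = list_products_with_names("cat-1")
print(query_count)
```

WITH THREE PRODUCTS THAT IS ALREADY FIVE
QUERIES, AND THE COUNT GROWS WITH EVERY
PRODUCT IN THE CATEGORY.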
THAT IS THE MOMENT WHERE EVERYBODY
IN THE AUDIENCE IS USUALLY GOING
TO SCREAM, WHERE ARE MY JOINS?
I NEED TO DO A JOIN OPERATION AGAINST
PRODUCT CATEGORIES AND TAGS. REMEMBER
WHAT I SAID: THAT IS SOMETHING THAT TODAY
YOU CANNOT DO ON COSMOS DB. YOU
CANNOT DO JOINS ACROSS MULTIPLE
CONTAINERS. WE ARE GOING TO INTRODUCE
A LEVEL OF DENORMALIZATION. WHENEVER
YOU ARE WORKING WITH NONRELATIONAL 
DATABASES, USUALLY YOU OPTIMIZE 
YOUR DATA MODELS TO MAKE SURE ALL 
THE DATA YOU NEED TO RETURN IS READY 
TO BE SERVED. IN OUR EXAMPLE, WHENEVER 
I RETURN THE PRODUCT, I ALSO NEED 
TO RETURN THE CATEGORY NAME, I ALSO 
NEED TO RETURN THE NAMES OF THE 
TAGS. SO I'M GOING TO MAKE SURE 
THIS IS READY TO BE READ WHENEVER 
I'M QUERYING MY DATABASE. SO IN 
ORDER TO DO THAT, I AM GOING TO 
SLIGHTLY AUGMENT MY PRODUCT DOCUMENT. 
I'M GOING TO ADD THE CATEGORY NAME
DIRECTLY IN THE PRODUCT. SO I'M 
GOING TO COPY MY -- THE NAME OF 
THE CATEGORY DIRECTLY WITHIN THE 
PRODUCT AND SAME GOES FOR THE TAGS. 
INSTEAD OF JUST STORING AN ARRAY 
OF TAG IDS I'M GOING TO STORE THE 
COMPLETE DETAILS OF THE TAGS SO
I CAN ALSO GET THE NAMES. SO EVERY 
TIME I AM GOING TO CREATE A PRODUCT 
IN THE DATABASE, I AM GOING TO INJECT 
THOSE DIFFERENT PIECES OF DATA TO 
MAKE SURE I HAVE EVERYTHING I NEED 
IN THE DATABASE. DOING THAT, THEN, 
SWITCHING BACK TO MY AZURE PORTAL, 
I CAN NOW FETCH ALL THE PRODUCTS
FOR A SPECIFIC CATEGORY. THIS IS
THE CATEGORY ID FOR MY ELECTRONICS
CATEGORY AND I JUST RETURN THE TOP
FIVE BECAUSE I DON'T WANT TO OVERWHELM
YOU WITH RESULTS. BUT WHEN I DO
THAT NOW, I GET PRODUCTS AND I HAVE
THE CATEGORY NAME AND, FOR EACH TAG,
I HAVE THE ID AND NAME. IN JUST ONE
QUERY I GET ALL THAT I NEED TO PUSH
TO THE FRONT END AND
TO DISPLAY TO THE USER. THIS IS
GOING TO WORK WELL UNTIL I RENAME
A CATEGORY. IF I DON'T DO ANYTHING
ELSE, AS SOON AS I RENAME A CATEGORY,
I WILL FALL INTO SOME INCONSISTENCY.
I WILL GET AN INCONSISTENCY BETWEEN
THE CATEGORY AND RELATED PRODUCTS BECAUSE
IF I DON'T DO ANYTHING ELSE, THE
PRODUCTS WILL STILL CARRY
THE OLD VALUE OF THE CATEGORY NAME.
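HERE IS A MINIMAL SKETCH OF THAT
DENORMALIZED PRODUCT AND OF THE STALE
COPY A RENAME LEAVES BEHIND. `denormalize`
IS A HYPOTHETICAL HELPER; THE PROPERTY
NAMES, IDS AND THE NEW CATEGORY NAME ARE
ASSUMPTIONS, NOT VALUES FROM THE DEMO.

```python
def denormalize(product, category, tags_by_id):
    """Copy the category name and full tag details into the product."""
    doc = {k: v for k, v in product.items() if k != "tagIds"}
    doc["categoryName"] = category["name"]
    doc["tags"] = [dict(tags_by_id[t]) for t in product["tagIds"]]
    return doc


# Made-up ids and names.
category = {"id": "cat-1", "name": "Electronics"}
tags_by_id = {"t-1": {"id": "t-1", "name": "wireless"}}
product = {"id": "p-1", "categoryId": "cat-1", "tagIds": ["t-1"]}

stored_product = denormalize(product, category, tags_by_id)

# Renaming the category touches only the category document...
category["name"] = "Consumer Electronics"

# ...so the copy inside the product is now out of date.
is_stale = stored_product["categoryName"] != category["name"]
print(stored_product["categoryName"], is_stale)
```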
SO WHAT I NEED TO DO IN ORDER TO 
SOLVE THAT, LOOKING AT OUR CURRENT 
DESIGN WHERE WE HAVE CATEGORIES, 
TAGS AND PRODUCT. EVERY TIME I UPDATE 
EITHER A CATEGORY OR A TAG, WHAT 
I NEED TO DO, I NEED TO PROPAGATE 
THAT RENAME TO THE PRODUCTS, TO 
THE CORRESPONDING PRODUCTS. WHEN 
I RENAME A CATEGORY, THAT NEW NAME 
SHOULD BE PROPAGATED TO ALL THE
PRODUCTS FROM THAT CATEGORY. TURNS 
OUT WE HAVE EXACTLY THE RIGHT FEATURE 
TO IMPLEMENT THAT ON COSMOS DB. 
EVERY COSMOS DB CONTAINER HAS AN 
API TO GET NOTIFIED EVERY TIME SOMETHING 
HAPPENS IN YOUR CONTAINER. EVERY 
TIME A DOCUMENT IS EITHER ADDED 
OR UPDATED IN A CONTAINER, THAT
DOCUMENT BECOMES PUBLISHED ON THE
CHANGE FEED AND ANY PIECE OF CODE
SUBSCRIBED TO THE CHANGE FEED RECEIVES
THE DOCUMENT TO DO WHATEVER YOU
WANT TO DO. WE ARE GOING TO SUBSCRIBE
TO THE CHANGE FEED. EVERY TIME THERE
IS AN UPDATE OF A CATEGORY, WE WILL
TRIGGER AN AZURE FUNCTION THAT WILL
RETRIEVE THE NEW CATEGORY AND UPDATE THE PRODUCTS.
WE WILL DO THE EXACT SAME THING
FOR THE TAGS. LET'S SEE THAT IN
ACTION. SO WHAT I'M GOING TO DO
FIRST, I'M GOING TO DO A POINT READ
TO JUST RETRIEVE ONE PRODUCT FROM 
THE CATEGORY HEALTH. I GET MY PRODUCT. 
THE NAME IS LICENSED. SO IT'S A BIT
RANDOM. WHAT WE CAN SEE HERE IS 
IT BELONGS TO CATEGORY HEALTH. NEXT, 
I'M GOING TO RETRIEVE THE HEALTH 
CATEGORY BY DOING ANOTHER POINT 
READ SO THIS IS THE CATEGORY COMING 
BACK. I'M GOING TO CHANGE THE NAME 
FROM HEALTH TO HEALTH AND FITNESS 
AND I'M GOING TO DO A REPLACE ITEM. 
I AM JUST UPDATING MY CATEGORY WITH A NEW
NAME. I'M GOING TO EXECUTE THAT.
HERE IS MY NEW CATEGORY. THAT IS
BASICALLY THE PAYLOAD THAT MY AZURE
FUNCTION WILL RECEIVE. WHEN
I RECEIVE THAT PAYLOAD, I KNOW THAT
THE CATEGORY WITH THAT ID HAS
A NEW NAME. IT IS NOW CALLED HEALTH
AND FITNESS. I AM GOING TO PROPAGATE
THAT NEW NAME TO ALL THE PRODUCTS.
IT SHOULD BE ALREADY DONE SO IF I
FETCH THE SAME PRODUCT AGAIN, YOU 
CAN SEE THAT IT IS STILL MY LICENSE 
OF SALAD BUT NOW THE CATEGORY NAME 
IS HEALTH AND FITNESS. BY JUST HOOKING 
A COUPLE OF AZURE FUNCTIONS TO THE 
CHANGE FEED OF MY CONTAINERS, I 
HAVE IMPLEMENTED A WAY TO GET CONSISTENCY 
ACROSS MY DIFFERENT CONTAINERS.
WHENEVER I RENAME A CATEGORY, I
DON'T HAVE TO REMEMBER THAT I ALSO HAVE
TO RENAME ALL THE DIFFERENT PRODUCTS.
THIS IS SOMETHING BEING HANDLED 
AUTOMATICALLY IN A SYSTEMATIC MANNER 
UNDER THE HOOD BY THE AZURE FUNCTIONS. 
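THE FUNCTION CODE IS NOT SHOWN IN THE
TRANSCRIPT, SO THE FOLLOWING IS ONLY AN
IN-MEMORY IMITATION OF THE PATTERN: A TOY
CONTAINER NOTIFIES ITS SUBSCRIBERS ON
EVERY WRITE, AND A HANDLER COPIES THE NEW
CATEGORY NAME INTO THE MATCHING PRODUCTS.
`ToyFeedContainer` AND
`propagate_category_name` ARE
HYPOTHETICAL NAMES, AND NO REAL CHANGE
FEED OR AZURE FUNCTIONS API IS USED.

```python
class ToyFeedContainer:
    """In-memory container that notifies subscribers on every upsert."""

    def __init__(self):
        self.items = {}
        self._subscribers = []

    def subscribe(self, handler):
        self._subscribers.append(handler)

    def upsert(self, doc):
        self.items[doc["id"]] = doc
        for handler in self._subscribers:
            handler(doc)  # the handler receives the changed document


categories = ToyFeedContainer()
products = ToyFeedContainer()
products.upsert({"id": "p-1", "categoryId": "cat-7", "categoryName": "Health"})
products.upsert({"id": "p-2", "categoryId": "cat-9", "categoryName": "Garden"})


def propagate_category_name(changed_category):
    """Play the role of the function hooked to the categories feed."""
    for product in list(products.items.values()):
        if product["categoryId"] == changed_category["id"]:
            products.upsert({**product, "categoryName": changed_category["name"]})


categories.subscribe(propagate_category_name)
categories.upsert({"id": "cat-7", "name": "Health and Fitness", "type": "category"})
print(products.items["p-1"]["categoryName"])
```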
AM I ON TIME? PRETTY GOOD. NEXT, 
THE LAST ENTITY WE HAVE TO TAKE 
CARE OF. YOU KNOW THE DRILL. I'LL
START WITH TWO DIFFERENT JSON DOCUMENTS,
ONE FOR THE SALES ORDER, ONE FOR
THE SALES ORDER DETAILS. IS THERE
ANY WAY TO OPTIMIZE THAT? WE CAN
EMBED. THIS IS YET ANOTHER ONE TO
FEW RELATIONSHIP. IT MAKES SENSE
TO PUT EVERYTHING IN THE SAME DOCUMENT.
WE MATERIALIZE THE DETAILS AS A
DETAILS ARRAY WITHIN THE JSON PAYLOAD.
WHY DOES IT MAKE SENSE? IT'S A ONE
TO FEW RELATIONSHIP BUT ALSO WHEN
I READ A SALES ORDER, I OBVIOUSLY
-- CERTAINLY WANT TO GET EVERYTHING
RELATED TO THE SALES ORDER. I WANT
ALL THE DETAILS, ALL THE PRODUCTS 
THAT HAVE BEEN SOLD. THAT NEW JSON 
DOCUMENT, I COULD STORE IT IN A 
NEW CONTAINER CALLED SALES ORDERS. 
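AS AN ILLUSTRATION OF A SALES ORDER WITH
ITS DETAILS EMBEDDED, THE SKETCH BELOW
COPIES EACH PURCHASED PRODUCT INTO A
DETAILS ARRAY AT ORDER TIME. THE
`build_sales_order` HELPER, THE PROPERTY
NAMES, THE CATALOG AND THE PRICES ARE ALL
ASSUMPTIONS FOR THE EXAMPLE.

```python
def build_sales_order(order_id, customer_id, cart, catalog):
    """Embed a copy of each purchased product in the order's details array."""
    details = []
    for product_id, quantity in cart:
        product = catalog[product_id]
        details.append(
            {
                "productId": product_id,
                "name": product["name"],
                "price": product["price"],
                "quantity": quantity,
            }
        )
    return {"id": order_id, "customerId": customer_id, "details": details}


# Made-up catalog and cart.
catalog = {
    "p-1": {"name": "Headphones", "price": 79.0},
    "p-2": {"name": "USB cable", "price": 9.0},
}
order = build_sales_order("o-1", "c-42", [("p-1", 1), ("p-2", 2)], catalog)

# A later price change does not rewrite what was sold.
catalog["p-1"]["price"] = 99.0
order_total = sum(d["price"] * d["quantity"] for d in order["details"])
print(order_total)
```

BECAUSE THE DETAILS ARE COPIES, READING
THE ONE ORDER DOCUMENT IS ENOUGH TO SHOW
WHAT WAS BOUGHT AND AT WHAT PRICE.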
CHOOSING THE PARTITION KEY CAN ONLY
BE DONE WHEN WE LOOK AT THE OPERATIONS.
THE SECOND ONE IS MORE IMPORTANT.
LET'S FOCUS A BIT ON THE SECOND
ONE: LISTING THE SALES ORDERS FOR ONE CUSTOMER.
THIS IS AN OPERATION THAT ANY CUSTOMER
WANTS TO DO. YOU GO TO YOUR PROFILE,
YOU WANT TO SEE YOUR RECENT PURCHASES.
THIS IS AN OPERATION THAT CAN
BE DONE WITH THE FOLLOWING SQL
QUERY, WHERE THE CUSTOMER ID EQUALS THE
CUSTOMER'S ID. AS YOU PROBABLY UNDERSTAND, FOR
THAT THING TO BE A SINGLE PARTITION 
QUERY, YOU NEED ALL THE SALES ORDERS 
RELATED TO THE SAME CUSTOMER TO 
SIT IN THE SAME LOGICAL PARTITION. 
SO AGAIN, HERE, THE RIGHT CHOICE 
OF PARTITION KEY WOULD ACTUALLY BE
THE PROPERTY WE ARE FILTERING ON, WHICH
IS CUSTOMER ID. BY PARTITIONING MY
CONTAINER BY CUSTOMER ID, I
MAKE SURE IT IS VERY EFFICIENT TO
RETRIEVE ALL THE SALES ORDERS FOR
ONE CUSTOMER. BEFORE WE GO AHEAD, LET'S
TAKE A STEP BACK AND LOOK
AT THE DIFFERENT CONTAINERS THAT
WE HAVE CREATED SO FAR. WE CREATED
ANOTHER CONTAINER BEFORE, CUSTOMERS,
CONTAINING OUR CUSTOMERS, AND WE PARTITIONED
THAT THING BY ID. IF YOU THINK ABOUT
IT, THAT OTHER CONTAINER, CUSTOMERS,
IS ALSO PARTITIONED BY CUSTOMER
ID BECAUSE THE ID OF THE CUSTOMER IS THE
CUSTOMER ID. SO HERE WE HAVE TWO
CONTAINERS THAT ARE GOING TO BE
PARTITIONED ON THE SAME DIMENSION.
SO WOULD IT MAKE SENSE TO PUT CUSTOMERS
AND SALES ORDERS INTO THE SAME CONTAINER?
IT IS GOING TO HURT YOUR BRAIN. 
SO MIXING ENTITIES INTO THE SAME
CONTAINER: FIRST, LET ME CLARIFY THAT THIS
IS SOMETHING THAT IS TECHNICALLY
POSSIBLE BECAUSE WE LEVERAGE THE SCHEMA-AGNOSTIC
NATURE OF A DATABASE LIKE COSMOS
DB. YOU CAN WRITE ANY SHAPE OF JSON
YOU WANT. SO NOT ONLY IS IT TECHNICALLY
POSSIBLE BUT IT IS A BEST PRACTICE.
IT IS ONE GOOD WAY TO OPTIMIZE YOUR
MODEL, BY PUTTING TOGETHER DIFFERENT
ENTITY TYPES INTO THE SAME CONTAINER.
GOOD REASONS, GOOD TYPICAL REASONS 
TO DO THAT, IT IS SUITABLE WHEN 
YOUR DIFFERENT ENTITY TYPES SHARE 
THE SAME ACCESS PATTERNS. MANY TIMES 
WHEN I'M FETCHING A CUSTOMER, I 
ALSO WANT TO FETCH THE CORRESPONDING 
SALES ORDERS. IT IS ALSO SUITABLE 
LIKE IN OUR CASE WHEN THOSE DIFFERENT 
ENTITY TYPES SHARE THE SAME PARTITION 
KEY. SO WHAT I'M GOING TO DO INSTEAD 
OF HAVING SEPARATE CONTAINERS
FOR CUSTOMERS AND SALES ORDERS,
I AM GOING TO PUT BOTH OF THEM IN
MY ORIGINAL CUSTOMERS CONTAINER.
DOING THAT, I NEED TO SLIGHTLY
ADJUST THE PARTITION KEY OF MY CUSTOMERS
CONTAINER. I WILL CHANGE IT FROM
ID TO CUSTOMER ID AND THAT IS A FIELD
I'M GOING TO ADD IN MY ORIGINAL
CUSTOMER DEFINITION BECAUSE THE
PARTITION KEY IS A PROPERTY I NEED
TO FIND IN ALL OF THE DOCUMENTS
I'M GOING TO WRITE INTO THE CONTAINER.
IN THE CASE OF THE CUSTOMER, ID 
AND CUSTOMER ID ARE GOING TO HOLD
THE SAME VALUE BUT THAT IS FINE.
THAT IS ACTUALLY HOW WE IMPLEMENT
THOSE PATTERNS. I AM GOING TO ADD
ANOTHER ADDITIONAL FIELD THAT IS
TYPE. I WILL HAVE TYPE EQUALS CUSTOMER
FOR CUSTOMERS AND TYPE EQUALS SALES
ORDER FOR SALES ORDERS. THE REASON
WHY I'M ADDING THAT ADDITIONAL PROPERTY:
IT GIVES ME A WAY TO DISTINGUISH
BETWEEN THE TWO ENTITY TYPES. WHEN
I QUERY MY CONTAINER, I WILL BE ABLE
TO GET ONE OR THE OTHER. SO JUST
TO CLARIFY, DOING THAT, I
WILL END UP WITH A CONTAINER WHERE 
EACH LOGICAL PARTITION WILL CONTAIN 
ONE CUSTOMER AND ALL THE SALES ORDERS 
RELATED TO THAT CUSTOMER. SO NOW, 
IN ORDER TO FETCH ALL THE SALES 
ORDERS FOR ONE CUSTOMER, THE SQL QUERY
FILTERS WHERE THE CUSTOMER ID EQUALS MY
CUSTOMER AND ALSO FILTERS ON THE
TYPE TO GET SALES ORDERS. THIS IS
SOMETHING I'M GOING TO EXECUTE AGAINST
THE CUSTOMERS CONTAINER. LET'S DO
THAT. SO THIS IS THE EXACT SAME
QUERY I JUST MENTIONED. THE CUSTOMER
ID IS THE CUSTOMER AND THE TYPE
EQUALS SALES ORDER. YOU GET ALL
THE SALES ORDERS OF THAT PARTICULAR
CUSTOMER ID. THAT ONE DID A LOT
OF PURCHASES. ONE INTERESTING THING
HERE IS THAT BY DOING THAT, BY MIXING
CUSTOMERS AND SALES ORDERS INTO
THE SAME CONTAINER, WE KIND OF GOT
OUR JOINS BACK. SO IF I HAD THE
REQUIREMENT TO ISSUE JUST ONE QUERY
TO GET THE CUSTOMER AND THE CORRESPONDING
SALES ORDERS, THIS IS NOW SOMETHING
THAT I CAN WRITE WITH JUST ONE QUERY.
IF I ISSUED THE QUERY HERE THAT
ONLY FILTERS ON THE CUSTOMER ID,
THIS WILL RETURN EVERYTHING RELATED
TO THE CUSTOMER, RIGHT? SO BOTH THE
CUSTOMER ITSELF AND THE SALES ORDERS.
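THE TWO QUERIES JUST DESCRIBED CAN BE
MIMICKED OVER A SMALL MIXED CONTAINER.
THIS IS A SKETCH WITH ASSUMED NAMES: THE
`customerId` AND `type` PROPERTIES, THE
`salesOrder` VALUE AND THE SQL TEXT IN
THE COMMENTS ARE GUESSES AT WHAT THE
SLIDES SHOWED, NOT QUOTES FROM THEM.

```python
# One container, two entity types. "customerId" is the partition key and
# "type" tells the entities apart. Values are illustrative assumptions.
container = [
    {"id": "c-42", "customerId": "c-42", "type": "customer", "lastName": "Book"},
    {"id": "o-1", "customerId": "c-42", "type": "salesOrder", "details": []},
    {"id": "o-2", "customerId": "c-42", "type": "salesOrder", "details": []},
    {"id": "c-7", "customerId": "c-7", "type": "customer", "lastName": "Page"},
]


def query(customer_id, doc_type=None):
    """Filter on the partition key, and optionally on the entity type."""
    return [
        doc
        for doc in container
        if doc["customerId"] == customer_id
        and (doc_type is None or doc["type"] == doc_type)
    ]


# WHERE c.customerId = 'c-42' AND c.type = 'salesOrder'
orders_only = query("c-42", "salesOrder")

# WHERE c.customerId = 'c-42' returns the customer and its orders together.
everything = query("c-42")
print(len(orders_only), len(everything))
```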
NOW TO MAKE SURE THAT MY FIRST RESULT
IS THE CUSTOMER, THE LITTLE TRICK
IS THAT I CAN ORDER BY TYPE AND
BECAUSE CUSTOMER STARTS WITH A C AND
SALES ORDER STARTS WITH AN S, DOING
THAT, NOW, I WILL -- I AM GUARANTEEING
THAT MY FIRST RESULT WILL BE MY
CUSTOMER, MR. BOOK, AND THEN THE
NEXT ONES WILL BE ALL THE SALES ORDERS RELATED
TO THE CUSTOMER. SO WE EVENTUALLY
GOT SOME OF OUR JOIN CAPABILITIES
BACK JUST BY MIXING DIFFERENT ENTITIES
INTO THE SAME CONTAINER. I AM GOING
TO SCROLL DOWN HERE. HE PURCHASED
A LOT OF STUFF. AND NOW TO FINISH,
STILL ON OUR SALES ORDERS, WE HAVE
A LAST IMPORTANT OPERATION WE NEED
TO BE ABLE TO MANAGE. THAT LAST
OPERATION IS LISTING ALL
OF OUR CUSTOMERS BY DESCENDING NUMBER OF ORDERS.
I WANT TO SEE WHAT ARE MY TOP CUSTOMERS.
WHO ARE THE CUSTOMERS WHO HAVE PURCHASED
THE MOST ON MY E-COMMERCE PLATFORM?
IN ORDER TO RUN SUCH A QUERY,
FOR EACH CUSTOMER, I SHOULD COUNT
HOW MANY SALES ORDERS HE OR SHE
HAS CREATED AND THEN I SHOULD
SORT THOSE COUNTS BY DESCENDING ORDER
AND RETURN THE CORRESPONDING CUSTOMERS.
THIS IS SOMETHING THAT TODAY YOU
CANNOT EXPRESS IN THE SQL DIALECT
THAT IS SPOKEN BY COSMOS
DB. THERE IS NO WAY TO WRITE JUST
ONE QUERY THAT WOULD ACTUALLY DIRECTLY
RETURN THAT. THE WAY WE ARE GOING
TO AGAIN WORK AROUND THAT LIMITATION
IS TO USE OUR FRIEND, DENORMALIZATION.
WHAT I'M GOING TO DO IS TO ADD
YET ANOTHER PROPERTY IN MY CUSTOMERS.
I AM GOING TO MATERIALIZE THE
COUNT OF SALES ORDERS IN THAT DOCUMENT
AND THEN I WILL BE ABLE TO DO A SORT
ON THAT PARTICULAR PROPERTY IN ORDER
TO GET THE RESULTS I WANT. SO WHAT
I WANT TO ACHIEVE, BACK TO OUR CUSTOMERS
CONTAINER, IS THAT EVERY TIME I'M
ADDING A SALES ORDER I WANT AT THE
SAME TIME TO UPDATE THE CUSTOMER 
TO INCREMENT MY SALES ORDER COUNT. 
THE THING WE CAN USE, THE COSMOS DB
FEATURE WE CAN USE, IS TO LEVERAGE
STORED PROCEDURES. THEY ARE WRITTEN
IN JAVASCRIPT AND WE CAN EXECUTE
THEM ON THE BACK END. THEIR
SCOPE OF EXECUTION IS ONE LOGICAL
PARTITION. SO ONE STORED PROCEDURE
CAN UPDATE EVERYTHING THAT IS CONTAINED
IN ONE LOGICAL PARTITION AND IT
WILL DO THAT UPDATE TRANSACTIONALLY,
WHICH IS REALLY IMPORTANT FOR
US HERE. SO IN MY CASE, INSTEAD
OF JUST CREATING A NEW DOCUMENT
WHEN I'M ADDING A NEW SALES ORDER,
I AM GOING TO PASS THAT DOCUMENT
TO MY STORED PROCEDURE, SO IT TAKES THE NEW
ORDER AS A PARAMETER. WHAT THE PROCEDURE
WILL DO: FIRST, IT IS GOING TO READ THE
CUSTOMER. THEN IT WILL INCREMENT
THE SALES ORDER COUNT, THAT NEW
PROPERTY WE HAVE ADDED. IT WILL
REPLACE THE CUSTOMER, SO THIS IS
UPDATING THE CUSTOMER, AND FINALLY,
IT WILL DO A CREATE DOCUMENT TO
ACTUALLY CREATE THE
ORDER IN THE SAME LOGICAL PARTITION.
NOW, REMEMBER THAT WITH STORED PROCEDURES,
THERE ARE ONLY TWO POSSIBLE OUTCOMES. EITHER
MY JAVASCRIPT FUNCTION WILL SUCCEED
AND BOTH WRITES WILL BE COMMITTED, OR IT
FAILS FOR SOME REASON AND THEN ALL
OF THOSE WRITES WILL BE ROLLED BACK.
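THE STEPS OF THAT PROCEDURE CAN BE
SKETCHED AS FOLLOWS. THIS IS A PYTHON
SIMULATION OVER AN IN-MEMORY LIST, NOT
THE ACTUAL JAVASCRIPT STORED PROCEDURE
FROM THE DEMO. THE FUNCTION NAME, THE
PROPERTY NAMES (customerId, type,
salesOrderCount) AND THE DUPLICATE ID
CHECK USED TO FORCE A FAILURE ARE ALL
ASSUMPTIONS.

```python
import copy


def create_order_with_count(partition, new_order):
    """Simulate the procedure: all writes commit together or not at all."""
    # Work on a copy so that a failure leaves `partition` untouched.
    staged = copy.deepcopy(partition)

    # 1. Read the customer stored in this logical partition.
    customer = next(
        (d for d in staged
         if d["type"] == "customer" and d["customerId"] == new_order["customerId"]),
        None,
    )
    if customer is None:
        raise ValueError("customer not found in this logical partition")

    # 2. Increment the denormalized count (the "replace customer" step).
    customer["salesOrderCount"] = customer.get("salesOrderCount", 0) + 1

    # 3. Create the order; a duplicate id makes the whole call fail.
    if any(d["id"] == new_order["id"] for d in staged):
        raise ValueError("duplicate order id")
    staged.append(dict(new_order))

    # Commit: both writes become visible at once.
    partition[:] = staged


partition = [{"id": "c1", "customerId": "c1", "type": "customer", "salesOrderCount": 0}]
order = {"id": "o1", "customerId": "c1", "type": "salesOrder"}

create_order_with_count(partition, order)      # succeeds: both writes land

try:
    create_order_with_count(partition, order)  # same id again: fails after the increment
except ValueError:
    pass                                       # nothing from the failed call was committed

print(partition[0]["salesOrderCount"], len(partition))
```

THE SECOND CALL FAILS AFTER THE COUNT
HAS ALREADY BEEN INCREMENTED ON THE
STAGED COPY, SO THE STORED COUNT STILL
MATCHES THE NUMBER OF ORDERS.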
I AM GUARANTEED I WILL HAVE STRONG
CONSISTENCY BETWEEN THE ACTUAL COUNT
OF SALES ORDERS AND THAT NEW FIELD
I HAVE ADDED IN MY CUSTOMERS. SO
NOW, THE QUERY BECOMES: WHERE THE
TYPE EQUALS CUSTOMER, ORDERED BY
THAT NEW PROPERTY WE HAVE ADDED,
IN DESCENDING ORDER. LET'S DO THAT AGAIN
ON MY NOTEBOOK. THIS IS THE EXACT
SAME QUERY. I AM JUST DOING THE
TOP 10. I HAVE THOUSANDS OF
CUSTOMERS. I CAN RUN THAT THING
AND HERE WE GET THE IDS OF MY TOP
BUYING CUSTOMERS WITH THEIR CORRESPONDING
SALES ORDER COUNT. LET'S HAVE A
QUICK LOOK AT THE REQUEST CHARGE
FOR THAT THING. REMEMBER, BEFORE
WE WERE RUNNING A SINGLE PARTITION
QUERY AND THAT THING RETURNED IN JUST
3.5 RUS. HERE, THE QUERY WAS A LITTLE
BIT MORE EXPENSIVE. DOES ANYBODY
KNOW WHY? BECAUSE THIS WAS A
CROSS PARTITION QUERY, RIGHT? LOOKING
BACK, IT IS FILTERING ON TYPE,
ORDERING BY SALES ORDER COUNT, BUT
THE PARTITION KEY IS CUSTOMER ID.
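TO MAKE THAT CONCRETE, HERE IS A SMALL
PYTHON SKETCH THAT GROUPS DOCUMENTS BY
customerId AND COUNTS HOW MANY GROUPS
EACH KIND OF QUERY HAS TO VISIT. IT IS
A SIMPLIFICATION: THE REAL SERVICE FANS
OUT TO PHYSICAL PARTITIONS, EACH HOLDING
MANY LOGICAL PARTITIONS, AND REQUEST
CHARGES ARE NOT MODELED. ALL NAMES AND
SAMPLE DATA ARE ASSUMPTIONS.

```python
from collections import defaultdict

DOCS = [
    {"id": "c1", "customerId": "c1", "type": "customer", "salesOrderCount": 2},
    {"id": "c2", "customerId": "c2", "type": "customer", "salesOrderCount": 5},
    {"id": "c3", "customerId": "c3", "type": "customer", "salesOrderCount": 1},
    {"id": "o1", "customerId": "c2", "type": "salesOrder"},
]


def group_by_partition_key(docs):
    """Group documents by the partition key, here customerId."""
    partitions = defaultdict(list)
    for doc in docs:
        partitions[doc["customerId"]].append(doc)
    return partitions


def top_customers(partitions, n):
    """No customerId filter, so every partition has to be visited."""
    visited = 0
    customers = []
    for docs in partitions.values():
        visited += 1
        customers.extend(d for d in docs if d["type"] == "customer")
    customers.sort(key=lambda d: d["salesOrderCount"], reverse=True)
    return customers[:n], visited


def customer_and_orders(partitions, customer_id):
    """Filtering on customerId touches exactly one partition."""
    return partitions.get(customer_id, []), 1


PARTITIONS = group_by_partition_key(DOCS)
top, visited_cross = top_customers(PARTITIONS, 2)
_, visited_single = customer_and_orders(PARTITIONS, "c2")
print([c["id"] for c in top], visited_cross, visited_single)
```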
WE DIDN'T FILTER ON CUSTOMER ID,
SO THIS IS TYPICALLY A CROSS PARTITION
QUERY. AS EXPECTED, THAT THING WAS
MORE EXPENSIVE THAN A SINGLE PARTITION
QUERY BECAUSE IT EVENTUALLY HIT
MULTIPLE PHYSICAL SERVERS. THIS IS
TYPICALLY A SITUATION WHERE IT IS
FINE. CROSS PARTITION QUERIES ARE
NOT EVIL. I HAVE NEVER SEEN A REAL
WORLD EXAMPLE WHERE ALL THE QUERIES
ARE SINGLE PARTITION. I WOULD ARGUE
THAT WOULD BE OVERENGINEERING MY
DATA MODEL. I'M FINE WITH THAT QUERY
MAYBE BEING A BIT SLOWER. THIS IS
SOMETHING I WILL EXECUTE
ONCE A DAY, ONCE A WEEK, SO I DON'T CARE
IF IT IS SLOWER AND MORE EXPENSIVE.
ONE LESSON TO LEARN HERE: DON'T
TRY TO END UP WITH A DESIGN THAT
WILL YIELD SINGLE PARTITION
QUERIES EVERYWHERE. FIRST, IT WILL
BE A MAJOR HEADACHE IF YOU HAVE
A REAL WORLD EXAMPLE, AND ALSO BECAUSE
IT WOULD BE OBVIOUS OVERENGINEERING.
DON'T TRY TO OVEROPTIMIZE AND JUST
OPTIMIZE THE MOST IMPORTANT OPERATIONS
THAT YOU HAVE. LOOKING AT OUR FINAL
DESIGN, WHERE WE ENDED UP RIGHT NOW,
WE HAVE FOUR DIFFERENT CONTAINERS.
ONE CONTAINING CUSTOMERS AND SALES
ORDERS, ONE FOR THE PRODUCTS, ONE
FOR THE TAGS, ONE FOR THE CATEGORIES.
THERE IS ONE LAST LITTLE OPTIMIZATION
WE CAN DO HERE. DOES ANYONE HAVE
ANY INTUITION HOW WE CAN FURTHER
OPTIMIZE THE DESIGN? SORRY? TAGS
AND CATEGORIES, AS WE MENTIONED BEFORE.
GOOD CANDIDATES FOR MERGING DIFFERENT
ENTITY TYPES INTO THE SAME CONTAINER
ARE CONTAINERS THAT SHARE THE SAME
PARTITION KEY. THAT IS THE
CASE HERE. WE HAVE TAGS AND CATEGORIES
THAT ARE BOTH PARTITIONED BY TYPE.
SO ONE WAY TO FURTHER OPTIMIZE THE
DESIGN IS TO PUT THEM TOGETHER IN
THE SAME CONTAINER. SO EVENTUALLY
WE WOULD BE ABLE TO MATERIALIZE
OUR LITTLE E-COMMERCE DATA MODEL
WITH THREE COSMOS DB CONTAINERS.
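THE MERGE OF TAGS AND CATEGORIES CAN
BE PICTURED WITH THIS SMALL PYTHON
SKETCH. IT USES PLAIN LISTS INSTEAD OF
REAL CONTAINERS, AND THE DOCUMENT
SHAPES, IDS AND NAMES ARE INVENTED FOR
THE EXAMPLE. THE ONLY THING TAKEN FROM
THE TALK IS THAT BOTH ENTITY TYPES ARE
PARTITIONED BY TYPE.

```python
# Two former containers, both partitioned by "type".
# Ids and names below are made-up sample values.
TAGS = [
    {"id": "t1", "type": "tag", "name": "tag-one"},
    {"id": "t2", "type": "tag", "name": "tag-two"},
]
CATEGORIES = [
    {"id": "k1", "type": "category", "name": "category-one"},
]

# One container instead of two; "type" stays the partition key.
MERGED = TAGS + CATEGORIES


def read_partition(container, type_value):
    """Read every document stored under one partition key value."""
    return [doc for doc in container if doc["type"] == type_value]


all_tags = read_partition(MERGED, "tag")
all_categories = read_partition(MERGED, "category")
print(len(all_tags), len(all_categories))
```

LISTING ALL TAGS OR ALL CATEGORIES IS
STILL A READ OF A SINGLE PARTITION KEY
VALUE, SO NOTHING IS LOST BY SHARING
THE CONTAINER.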
KEY TAKEAWAYS. ALWAYS START YOUR
MODELING EXERCISE BY IDENTIFYING
WHICH ARE THE MOST IMPORTANT OPERATIONS
THAT YOUR DATA MODEL NEEDS TO
FULFILL. WITHOUT THAT STEP, IF YOU
DON'T HAVE AN INITIAL UNDERSTANDING
OF WHICH ARE THE MOST IMPORTANT
OPERATIONS THAT YOUR MODEL WILL HAVE
TO FULFILL, THERE IS NO WAY FOR
YOU TO MAKE THE BEST DECISIONS, BECAUSE
THAT IS THE INFORMATION
THAT WILL GUIDE YOUR CHOICES:
SHOULD I EMBED, SHOULD I REFERENCE?
SHOULD I PUT THE ENTITIES IN DIFFERENT
CONTAINERS OR THE SAME CONTAINER?
SHOULD I COLOCATE? SHOULD I DENORMALIZE?
YOU CAN ONLY ANSWER THOSE IF YOU
KNOW WHAT ARE THE OPERATIONS YOU
NEED TO DESIGN FOR. PARTITIONING
IS CRITICAL. MAKE SURE YOU UNDERSTAND
WHAT PARTITIONING IS AND HOW IT
WORKS. IT IS NOT THAT COMPLICATED,
BUT ONCE YOU HAVE A GRASP OF WHAT
PARTITIONING IS UNDER THE HOOD, THEN
MAKE SURE YOU ARE CORRECTLY LEVERAGING
PARTITIONING. THERE ARE DIFFERENT
WAYS TO MATERIALIZE RELATIONSHIPS
BETWEEN DIFFERENT ENTITIES IN COSMOS
DB. YOU CAN EMBED, LIKE WE EMBEDDED
CUSTOMER ADDRESSES IN CUSTOMERS.
YOU CAN DENORMALIZE AND AGGREGATE,
LIKE WE DID WITH TAGS AND PRODUCTS.
WE KIND OF PREAGGREGATED THE PRODUCT.
YOU CAN ALSO STORE RELATED ENTITIES
IN THE SAME CONTAINER, JUST LIKE
WE DID WITH THE CUSTOMERS AND
THE SALES ORDERS. COSMOS DB EXPOSES
DIFFERENT METHODS FOR YOU TO IMPLEMENT
DENORMALIZATION. THE CHANGE FEED
WAS ONE OF THEM. IT'S A VERY POPULAR
ONE, AND DON'T BE SCARED OF USING
THE CHANGE FEED. SOMETIMES PEOPLE
THINK THAT IS ANOTHER MOVING PART
WITHIN MY OVERALL ARCHITECTURE AND
DATA MODEL. WE GIVE YOU WAYS TO
CONSUME AND PROCESS THE CHANGE
FEED IN A VERY RELIABLE MANNER.
WE GIVE YOU GUARANTEES THAT THE
CHANGE FEED WILL BE PROCESSED VERY
RELIABLY, MITIGATING THE RISK
OF EVENTUALLY HAVING INCONSISTENCIES
IN YOUR DATA MODEL. ANOTHER WAY
WE EXPLORED WAS STORED PROCEDURES,
WHICH IS A VERY POWERFUL WAY IF
YOU NEED TO DENORMALIZE DATA. THAT
FIRST LINK IS A LINK TO THE REPOSITORY
THAT IS EMPTY RIGHT NOW. I AM IN
THE PROCESS OF WRITING A FULL IMPLEMENTATION
OF EVERYTHING I JUST PRESENTED AND
EVEN MUCH MORE, BECAUSE THOSE IGNITE
SESSIONS ARE VERY SHORT. WE ONLY
HAVE 45 MINUTES SO I HAD TO LIMIT
THE DIFFERENT OPERATIONS I WANTED
TO SHARE WITH YOU. I AM WORKING
ON THE FULL IMPLEMENTATION AND WILL SHOW
YOU, FOR EXAMPLE, HOW YOU CAN SEARCH
PRODUCTS BY TAGS, WHICH IS OBVIOUSLY
A VERY COMMON THING THAT
YOU EXPECT TO DO ON AN E-COMMERCE
PLATFORM. THIS IS ONE OF THE THINGS
I WILL COVER. I HOPE TO BE ABLE
TO PUBLISH THAT IN THE COMING WEEKS
OR MONTHS. WHAT YOU CAN DO IN THE
MEANTIME IS GO TO GITHUB AND WATCH
THE REPOSITORY SO YOU GET NOTIFIED
WHEN I'M ABLE TO PUBLISH. THE SECOND
LINK MAY BE OF INTEREST TO YOU.
THIS IS A LINK TO OUR DOCUMENTATION
THAT PRESENTS ANOTHER KIND OF USE
CASE. THOSE OF YOU WHO HAVE SEEN MY
PREVIOUS TALKS, I USUALLY TAKE THE
EXAMPLE OF A BLOGGING PLATFORM
BEING ASOM PLATFORM. THIS IS A LINK
TO THE DOCUMENTATION THAT ACTUALLY
COVERS THAT USE CASE WITH THE SAME
PROCESS OF EXPLORING WHICH CONTAINERS
WE NEED, HOW DO WE MODEL AND
PARTITION OUR DATA IN SUCH A SITUATION.
WE ALSO HAVE A SIDE WEB SITE CALLED
GOTCOSMOS.COM WHERE WE PUT EVEN
MORE SAMPLES AND WORKSHOPS AND ADDITIONAL
CONTENT. THAT IS ALL FROM ME TODAY.
THANK YOU VERY MUCH FOR YOUR ATTENTION.
I WILL STICK AROUND FOR AS MUCH TIME
AS NEEDED TO MAKE SURE I ADDRESS
