Forwarded from Laziz Abdullaev
States of Mind
"I get better ideas once I go for a walk" - you have probably heard many people say something like this.
It is actually no coincidence. A neural network in the brain called the Default Mode Network (DMN) is responsible for the thoughts that wander through your mind when you let your attention roam free instead of fixing it on a narrow task.
Imagine your mind holds a great many ideas (states), but the connection between any two ideas is not clear. To solve a given problem, you need to find a path through those states that leads to the solution.
Consciously finding the right starting point in such a large graph, and then choosing the right next state, is hard (this is when you get stuck).
When you "go for a walk", the DMN performs a random walk across the brain's states. This raises the probability of selecting states you would be unlikely to choose deliberately.
As a result, a way out of the dead end starts to appear by chance.
___
Large Reasoning Models (LRMs) can be steered in a way that is similar in nature. The next-token prediction distribution is artificially moved from a peaky distribution toward a flatter (closer to uniform) one by temperature annealing. This considerably widens the set of next tokens that can be selected.
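A minimal Python sketch of that flattening effect, assuming the usual softmax-with-temperature formulation. The logits and the temperature schedule are made-up values for illustration, and the schedule simply raises the temperature step by step; it is not taken from any particular model.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; a higher temperature flattens them."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats; larger means closer to uniform."""
    return -sum(p * math.log(p) for p in probs if p > 0)


# Hypothetical scores for four candidate next tokens.
logits = [4.0, 2.0, 1.0, 0.5]
# Hypothetical schedule: the temperature is raised step by step.
schedule = [0.5, 1.0, 2.0, 4.0]

for t in schedule:
    probs = softmax_with_temperature(logits, t)
    print(f"T={t}: top probability={max(probs):.3f}, entropy={entropy(probs):.3f}")
```

As the temperature rises, the top token's probability shrinks and the entropy grows, so tokens that were almost never sampled get a real chance of being picked.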
@lazizabdullaev
Got a Google offer
...and I rejected it.
I interviewed with Google about a year ago, passed all four interview rounds, and received an offer for the Munich office.
Total compensation was about 20% higher than what I was making in Berlin. But Munich is at least 20% more expensive to live in.
Moving for essentially the same compensation didn't make sense. I'm happy with my work at Amazon, and the Google offer just wasn't compelling enough to justify the move.
Got an offer from Google
...and I turned it down.
About a year ago I interviewed with Google, passed all four interview rounds, and received an offer for the Munich office.
The total compensation offered was about 20% higher than my pay in Berlin. But the cost of living in Munich is at least 20% higher than in Berlin.
Moving to Munich for practically the same pay did not make sense. I am happy with my work at Amazon, and the Google offer did not look attractive enough to justify the move.
Designing systems for 'unknown unknowns'
Part 1
🇺🇸
In large-scale distributed systems, anything can fail at some point. So as part of system design, engineers are expected to list the anticipated failure modes and explain how their design mitigates and recovers from them. We then cover those known failure modes with automated tests.
However, that only covers what is known. Once the system is deployed in the wild, things will still fail in unexpected ways; failure is inevitable because of unknown unknowns.
What do we do about them?
We acknowledge that tests cannot prevent every failure, so what matters for these failures is how fast we recover from them. There are standard metrics that measure this more formally:
- MTTD (mean time to detection): how quickly do we find out something failed?
- MTTA (mean time to acknowledgement): how quickly do we start responding?
- MTTR (mean time to recovery): how quickly do we recover?
For example,
MTTD = sum(detection_time_i - incident_start_time_i) / number of incidents.
The smaller these numbers are, the more resilient your system is to failures.
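As a rough illustration of the formula above, here is a small Python sketch that computes all three metrics from an incident log. The incident records are invented, and the exact boundaries are assumptions: MTTA is measured from detection to acknowledgement, and MTTR from incident start to recovery. Teams define these intervals differently.

```python
from datetime import datetime, timedelta


def mean_duration(pairs):
    """Average of (end - start) over a list of (start, end) datetime pairs."""
    if not pairs:
        raise ValueError("need at least one incident")
    total = sum((end - start for start, end in pairs), timedelta())
    return total / len(pairs)


# Hypothetical incident log, for illustration only.
incidents = [
    {
        "start": datetime(2024, 1, 5, 10, 0),
        "detected": datetime(2024, 1, 5, 10, 20),
        "acknowledged": datetime(2024, 1, 5, 10, 25),
        "recovered": datetime(2024, 1, 5, 11, 0),
    },
    {
        "start": datetime(2024, 2, 9, 14, 0),
        "detected": datetime(2024, 2, 9, 14, 10),
        "acknowledged": datetime(2024, 2, 9, 14, 25),
        "recovered": datetime(2024, 2, 9, 15, 30),
    },
]

# MTTD: incident start -> detection.
mttd = mean_duration([(i["start"], i["detected"]) for i in incidents])
# MTTA (assumed definition): detection -> acknowledgement.
mtta = mean_duration([(i["detected"], i["acknowledged"]) for i in incidents])
# MTTR (assumed definition): incident start -> recovery.
mttr = mean_duration([(i["start"], i["recovered"]) for i in incidents])

print("MTTD:", mttd)
print("MTTA:", mtta)
print("MTTR:", mttr)
```

With these two made-up incidents, detection takes 20 and 10 minutes, so MTTD comes out at 15 minutes.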
How do we make these numbers smaller?
I will answer that in Part 2.
#distributedsystems #operations #aws #operationalexcellence #testing
Building systems for 'unknown unknowns'
Part 1
🇺🇿
In large-scale distributed systems, any component can fail at some point. That is why, while designing a system's architecture, engineers are expected to list the likely failure modes and show how their architecture prevents those failures and recovers from them. In this way, we cover the known failure modes with automated tests.
However, this only lets us prevent the failures that are known (the ones we are aware of and expect). Once the system is deployed to a real environment, it fails in unexpected places; that is, failures are inevitable because of unknown unknowns.
What do we do about them?
We accept that we cannot prevent every failure with tests, so what matters for these failures is how quickly we recover from them. There are standard metrics that measure this more formally:
- MTTD (mean time to detection): how quickly do we learn that something has failed?
- MTTA (mean time to acknowledgement): how quickly do we start responding?
- MTTR (mean time to recovery): how quickly do we recover?
For example,
MTTD = sum(detection_time_i - incident_start_time_i) / number of incidents.
The smaller these numbers are, the more resilient your system is to failures.
How can we make these numbers smaller?
I will answer that in Part 2.
#distributedsystems #operations #aws #operationalexcellence #testing
Designing systems for 'unknown unknowns'
Part 2
🇺🇸
In Part 1, I talked about the fact that unknown unknowns are inevitable. We cannot forecast them upfront, so the resilience of a system to these failures comes from how fast we detect, acknowledge, and recover when the system fails.
One way to deal with unknown unknowns is fault injection, in which engineers intentionally introduce failures into a system and observe how it behaves.
Different companies use different approaches for injecting failures into their systems. Netflix developed Chaos Monkey, a tool that randomly terminates production servers to force services to tolerate server failures.
At AWS, this idea is applied through a game day process. A game day is an exercise where failures are simulated and teams respond using the same tools and processes they would use during a real incident, exposing gaps in detection, response, and recovery.
This is directly related to the metrics discussion in Part 1. During a game day, you can measure how long it takes before someone notices something is wrong (MTTD), how long it takes before someone actively starts responding (MTTA), and how long it takes to mitigate or recover (MTTR).
For example, in one of the game days I led, I introduced an artificial delay in a database client, which caused messages to pile up in a message queue. Error rates did not change, and the system looked "healthy" from the dashboards. After a few hours, the issue was eventually noticed through a report from a dependent team whose data was no longer getting refreshed.
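A fault like that can be sketched in Python as a thin wrapper that adds latency without raising any error. This is only an illustration of the idea, not the setup used in that game day: `SlowClient`, `FakeDatabaseClient`, and the `query` method are hypothetical names, and the sleep function is injectable so the example does not actually wait.

```python
import time


class SlowClient:
    """Wraps a client and delays every call to its `query` method.

    `query` is a hypothetical method name; `sleep` is injectable so the
    delay can be faked in tests instead of actually waiting.
    """

    def __init__(self, client, delay_seconds, sleep=time.sleep):
        self._client = client
        self._delay_seconds = delay_seconds
        self._sleep = sleep

    def query(self, statement):
        # Injected fault: extra latency only, the call still succeeds.
        self._sleep(self._delay_seconds)
        return self._client.query(statement)


class FakeDatabaseClient:
    """Stand-in for a real database client, used only for this illustration."""

    def query(self, statement):
        return f"result of {statement}"


# Record the requested delays instead of sleeping for real.
recorded_delays = []
client = SlowClient(FakeDatabaseClient(), delay_seconds=2.0, sleep=recorded_delays.append)

print(client.query("SELECT 1"))
print(recorded_delays)
```

Because every call still returns successfully, an error-rate dashboard stays green, which matches why the issue in the story went unnoticed for hours.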
That game day showed that the team would lose most of its time in detection (poor MTTD) and acknowledgement (poor MTTA), resulting in poor MTTR. We took action items based on what we learned: we added an alarm on the age of the oldest message in the queue, turning the unknown unknown into a known unknown. The next time database latency is high in production, the system will detect it early.
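The idea behind such an alarm can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the in-memory `TimestampedQueue`, the 300-second threshold, and passing the clock in as `now`. A real setup would rely on whatever age metric the queueing service exposes.

```python
from collections import deque


class TimestampedQueue:
    """Toy in-memory queue that remembers when each message was enqueued."""

    def __init__(self):
        self._items = deque()

    def put(self, message, now):
        self._items.append((now, message))

    def get(self):
        return self._items.popleft()[1]

    def age_of_oldest(self, now):
        """Seconds the oldest message has been waiting; 0.0 if the queue is empty."""
        if not self._items:
            return 0.0
        return now - self._items[0][0]


def should_alarm(queue, now, max_age_seconds):
    """Fire when the oldest message has waited longer than the threshold."""
    return queue.age_of_oldest(now) > max_age_seconds


queue = TimestampedQueue()
queue.put("message-1", now=0.0)
queue.put("message-2", now=30.0)

# Consumer keeping up: the oldest message is 60 seconds old.
print(should_alarm(queue, now=60.0, max_age_seconds=300.0))
# Consumer stalled: the oldest message is 900 seconds old.
print(should_alarm(queue, now=900.0, max_age_seconds=300.0))
```

Unlike error rate, this signal moves as soon as consumers slow down, even when every individual call still succeeds.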
Each game day surfaces gaps, and fixing them improves how the system responds next time.
Voilà, you have just improved your MTTD, MTTA, and MTTR!
@abdullaevdev
#distributedsystems #operations #aws #operationalexcellence #testing
Building systems for 'unknown unknowns'
Part 2
🇺🇿
In Part 1, I wrote about how unknown unknowns are inevitable. We cannot predict them in advance, so a system's resilience to this kind of failure depends on how quickly we detect, acknowledge, and recover when the system fails.
One way to deal with unknown unknowns is fault injection, in which engineers deliberately introduce failures into a system and observe how it behaves as a result.
Different companies take different approaches to injecting failures into their systems. For example, Netflix developed Chaos Monkey. This tool randomly terminates running servers to check that services tolerate server failures.
At AWS, this idea is put into practice through a process called a game day. During a game day, failures are simulated and teams are asked to resolve them using their existing tools and processes. The exercise exposes the team's gaps in detecting a failure, responding to it, and recovering from it.
This ties directly to the metrics discussion in Part 1. A game day helps you measure how long it takes before someone notices that something is wrong (MTTD), how long it takes before someone starts responding (MTTA), and how long it takes to mitigate the failure or recover (MTTR).
For example, in one of the game days I led, I introduced an artificial delay into a database client, which caused messages to pile up in a message queue. The error rate did not change, and the system looked "healthy" on the dashboards. After a few hours, the problem was noticed through a report from another team that uses our system: their data was not being refreshed.
That game day showed that the team would lose most of its time on detection (poor MTTD) and acknowledgement (poor MTTA), which led to poor MTTR. After the game day, we defined action items to improve the system based on what we had learned: we added an alarm on the age of the oldest message in the queue, turning the unknown unknown into a known unknown. The next time latency to the database is high in production, the system is able to detect it early.
Each game day brings gaps to the surface, and fixing them improves how the system responds next time.
There, you have just improved your MTTD, MTTA, and MTTR!
@abdullaevdev
#distributedsystems #operations #aws #operationalexcellence #testing
Correction of Error (COE)
🇺🇸
Even with strong detection and prevention mechanisms, failures still happen. For systems where reliability and availability matter, teams need a consistent way to learn from incidents and prevent them from repeating.
At Amazon, this is done through an established and publicly documented process called Correction of Error (COE) (click the link to learn more).
After an incident is fully mitigated (important: mitigate first, ask questions later), teams break down the timeline, identify the real root causes, quantify the impact, and assign concrete action items to prevent recurrence. When there is a COE, it takes priority over any feature delivery, and teams get pre-allocated annual capacity buffers for writing potential COEs.
Importantly, a COE is not about blaming individuals or punishment. Its purpose is to create maximum visibility into improvement areas and to reward open learning so teams address weaknesses in systems and processes rather than hide them.
I am fortunate to have owned and written 3 COEs during my time at Amazon so far, and they were some of the best learning experiences I've had as an engineer.
#operationalexcellence #operations #coe #reliability #availability
Correction of Error (COE)
🇺🇿
Even when strong detection and prevention mechanisms are in place, systems still fail. For systems where reliability and availability matter, teams need a consistent way to learn from incidents and prevent them from recurring.
At Amazon, this is done through a formal, publicly documented process called Correction of Error (COE) (follow the link to learn more).
Once an incident is fully mitigated (important: mitigate first, ask questions later), teams analyze the sequence of events, identify the root causes, assess the impact, and assign concrete action items to prevent recurrence. When there is a COE, it takes priority over any feature delivery, and teams have time set aside in advance in their annual plans for writing COEs.
Importantly, a COE is not about blaming or punishing people. Its purpose is to encourage teams to discuss weaknesses in systems and processes openly rather than hide them, and to make areas for improvement visible.
So far, I have written 3 COEs during my time at Amazon. They have been among my best growth and learning experiences as an engineer.
#operationalexcellence #operations #coe #reliability #availability
Average is not all you need
🇺🇸
We use averages everywhere in daily life. Average temperature, commute time, meat prices at the market. They work great for normally distributed data.
But some things aren't normally distributed. Think about wealth: 9 people with ~$50K net worth, one person walks in with $1M. Average jumps to $145K. This is technically correct, but practically useless.
Latency is more like wealth than temperature. Say 9 requests complete in 10ms and 1 request takes 10 seconds. The average is ~1 second, which looks fine. But one real customer just waited 10 (!) seconds.
So what do we use instead?
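One common option is to report the median and high percentiles next to the mean. The Python sketch below reruns the two examples above with the standard `statistics` module. The nearest-rank `percentile` helper is an illustrative choice and is not taken from the article.

```python
import math
import statistics


def percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of the data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Wealth example: 9 people with $50K and one with $1M.
net_worth = [50_000] * 9 + [1_000_000]
print("mean net worth:", statistics.mean(net_worth))
print("median net worth:", statistics.median(net_worth))

# Latency example: 9 requests at 10 ms and one at 10 seconds.
latencies_ms = [10] * 9 + [10_000]
print("mean latency (ms):", statistics.mean(latencies_ms))
print("median latency (ms):", statistics.median(latencies_ms))
print("p90 latency (ms):", percentile(latencies_ms, 90))
print("p99 latency (ms):", percentile(latencies_ms, 99))
```

The median describes the typical request (10 ms), while a high percentile such as p99 surfaces the slow one that the roughly 1-second mean blurs.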
Read the full article here:
https://abdullaev.dev/average-is-not-all-you-need/
P.S. You need more statistics.
@abdullaevdev
#monitoring #observability #statistics #aws
The average is not enough
🇺🇿
We use averages everywhere in daily life: average temperature, commute time, meat prices at the market. They work very well for normally distributed data.
But some things are not normally distributed. Think about wealth: 9 people have a net worth of ~$50K, and 1 person walks into the room with $1M. The average jumps to $145K. This is technically correct, but useless in practice.
System latency is like wealth, not temperature. 9 requests finish in 10ms and 1 request takes 10 seconds, so the average is ~1 second, which looks fine. But one real customer waited 10 (!) seconds.
So what do we use instead?
Read the full article here:
https://abdullaev.dev/average-is-not-all-you-need/
P.S. You need more statistics.
@abdullaevdev
#monitoring #observability #statistics #aws
Before you apply to Amazon, read this
🇺🇸
Amazon is not for everyone.
I have been at Amazon for close to 4 years, and it is one of the best places I have worked. But it has a very specific character, and I have seen good engineers struggle here not because they lacked skill, but because the intense environment wasn't the right fit for them.
Let me tell you what that looks like.
You will own everything.
You don't only own code here. You own a product. Writing code is just 1 of the 10+ things you are responsible for, alongside your service's infrastructure, monitoring, oncall, design docs, customer communications, etc. There is no "that's not my job" here. If your system breaks at 2am on a Saturday, that's your problem to fix. If you want to write code and hand it off to someone else (ops, QA, DevOps), Amazon will frustrate you quickly.
Startup mindset.
Despite being one of the largest companies in the world, Amazon operate like 'the world's largest startup'. Resources are not handed to you: you fight for compute, you justify your infrastructure costs, you build with what you have. If you are expecting the comfort of a big corporate environment, you will be surprised (also no free lunches like you see on Software Engineering day YouTube videos).
You will be held to a high bar; continuously.
The interview is just the beginning, and probably the easiest step. The same rigor that got you in is expected every day after. Code reviews, design docs, COEs, operational reviews: everything gets scrutinized. A senior engineer once told me:
"success has many fathers, but failure has none."
That is more true at Amazon than anywhere I have worked.
Ambiguity is the default.
Your manager will not tell you what to do. You are expected to figure out the problem and the solution, and to convince everyone it's right. It is your job to create clarity and disambiguate a business problem. You are expected to drive every open question to a conclusion, even the uncomfortable ones.
Oncall follows you home.
If I am oncall for my team this week and something breaks, I am the first to know and the first to respond: day, night, or weekend. This is not optional; it rotates across the team. Some people find this energizing, others find it exhausting. Know which one you are before you join.
If you read this and felt excited: you will love working at Amazon.
If you felt anxious: it is worth reflecting on that before you apply.
@abdullaevdev
#amazon #aws #career #faang
Read this before applying to Amazon
🇺🇿
Amazon is not for everyone.
I have been working at Amazon for almost 4 years, and it is one of the best places I have worked. But the work here has a character of its own, and I have seen good engineers struggle here: not because they lacked skill, but because this intense environment did not suit them.
Let me tell you what that looks like.
You are responsible for everything.
Here you are responsible not just for the code but for the whole product. Writing code is only one of the 10+ things you are responsible for: your service's infrastructure, monitoring, oncall, design docs, customer communication, and so on. There is no "that's not my job" here. If your system goes down at 2am on a Saturday, that is your problem. If you want to only write code and hand the rest off to others (ops, QA, DevOps), working at Amazon may frustrate you quickly.
Working like a startup.
Despite being one of the largest companies in the world, Amazon behaves like "the world's largest startup". Resources are not simply handed to you: you fight for servers, you justify your infrastructure costs, you build with what is available. If you are expecting the comfort of a big corporate environment, you will be surprised (also, there are no free office lunches like the ones you see in YouTube videos).
You must meet a high bar at all times.
The hiring interview is only the beginning, and probably the easiest step by comparison. The rigor you faced when getting in is required every day. PR reviews, design docs, COEs, operational reviews: everything is scrutinized. A senior engineer once told me:
"success has many fathers, but failure has none."
That is more true at Amazon than at any other place I have worked.
Ambiguity is the norm.
Your manager will not tell you what to do. You have to find the problem and the solution, and convince everyone that it is right. Clarifying and disambiguating the business problem is your job. You are expected to drive every open question to a conclusion, even the uncomfortable ones.
Oncall follows you home.
If I am oncall for my team this week and something breaks, I am the first to know and the first to respond, whether it is day, night, or the weekend. This is mandatory, and it rotates across the team. Some find it energizing, others find it exhausting. Know which group you belong to before you join Amazon.
If reading this excited you: you will enjoy working at Amazon.
If you felt anxious: it is worth thinking it over before you apply.
@abdullaevdev
#amazon #aws #career #faang
The guy who got a job at Amazon "without knowing how to use a computer"
In the summer of 2017 I graduated from lyceum. A few days later, admissions opened at higher-education institutions. The two universities that interested me most were:
1) the National University of Uzbekistan, and
2) Inha University in Tashkent.
Back then, applications to state universities were submitted offline, in person at the university. I went to the National University to apply. The person who accepted my documents told me to go to computer room 2 afterwards, since the application form was filled in electronically on a computer.
I sat down at a computer to fill in the application "like everyone else", but the form would not open on the computer I was using. When I told the same man, the question he asked me was: "Do you even know how to use a computer?" Meanwhile, I was a kid who had "grown up" with computers since the age of 7. Taken aback, I just said, "Sorry, it won't open." He came over, could not open the form either, said it wasn't working, and told me to move to another computer.
...And so, one way or another, I submitted my documents, came home, and applied to Inha University online.
Entrance exams took place at both universities. I was admitted to both on a grant, and I decided to study at Inha University.
Because I had handed the originals of my documents to the National University, I had to collect them from there and give them to Inha University. When I went to the National University to pick them up, the same person who had accepted my documents was sitting there. Assuming I had failed the exam, he asked: "How many points short were you?" I told him that I had been admitted in first place on a grant, but had come to collect my documents.
In short, I studied at Inha University and graduated. Years later, that guy who "didn't know how to use a computer" now works as a software engineer at Amazon.
@abdullaevdev
#career #faang #amazon #aws
Honestly, this means a lot. I write to share my knowledge, and knowing someone finds value in it makes every hour spent worth it.
___
This truly means a lot. I write to share what I know, and seeing that it is useful to you makes every hour I spent worth it.
You are not paid to write code
🇺🇸
Hearing this may hurt you if you are a software engineer, but honestly: you're paid to make customers happy, increase profits (so that you can get paid) and maybe to decrease your AWS costs. The code is just one of the ingredients to bring that value.
However, most engineers do it in reverse order and optimize for the wrong thing. They are trained to think in terms of "how": how to build it, how to scale it, how to make it extensible. But sometimes the right answer is not a new system at all: it can be a config change, a 15-minute meeting, or just saying "no" to the requirement.
Building the system anyway feels like engineering, but it is just solving the wrong problem, one built on assumptions.
Two traps pull engineers in this wrong direction:
Trap 1: Falling in love with the technology. You learned Rust, or a new framework. Then you unconsciously look for problems that fit your preferred tools, instead of identifying the right problem and then looking for a tool to solve it. That tool can be code, a spreadsheet or even a hoe...
Trap 2: Building for the requirements you assumed. You build a general-purpose system when today's requirement is just one specific thing. Generalizations are useless until you have a second real use case. Until then, you're just building for a future you invented. Ask yourself: if you write Java or any OOP language, how many of your interfaces ever got a second implementation? Or did they just sit there, adding complexity, making the codebase harder to change without making a single customer happier?
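To make Trap 2 concrete, here is a minimal sketch in Python (the same pattern applies to Java interfaces). Every name in it (`Notifier`, `EmailNotifier`, `NotifierFactory`, `send_email`) is hypothetical and invented for illustration; it is not taken from any real codebase. It assumes today's only requirement is sending an email.

```python
from abc import ABC, abstractmethod


# Speculative version (hypothetical names): an abstraction introduced
# before any second use case exists.
class Notifier(ABC):
    @abstractmethod
    def send(self, recipient: str, message: str) -> str:
        ...


class EmailNotifier(Notifier):
    """Still the only implementation of Notifier."""

    def send(self, recipient: str, message: str) -> str:
        return f"email to {recipient}: {message}"


class NotifierFactory:
    """Extra indirection whose only job is to pick the single implementation."""

    @staticmethod
    def create(kind: str) -> Notifier:
        if kind == "email":
            return EmailNotifier()
        raise ValueError(f"unsupported notifier kind: {kind}")


# Direct version: covers today's requirement and nothing more.
def send_email(recipient: str, message: str) -> str:
    return f"email to {recipient}: {message}"


speculative = NotifierFactory.create("email").send("a@example.com", "order shipped")
direct = send_email("a@example.com", "order shipped")

# Both paths produce the same result for the customer.
print(speculative == direct)
```

Both versions do the same thing for the customer. The first one carries three extra types to maintain for a second notifier that may never arrive; if a real second use case shows up, extracting the interface at that point is usually a small refactor.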
The fix is to work backwards.
Before writing any code, write down the problem in plain language: what's broken, why fixing it matters, and what value your customers will get. No solutions, no framework choices; just focus on the problem.
At Amazon, we work backwards (see more here) by writing PR/FAQs, design docs and one-pagers. The formats differ, but each one works backwards from the same thing: the customer problem.
Because at the end of the day, you're not paid to write code. You're paid to solve the right problems, with or without it.
@abdullaevdev
#engineering #softwareengineering #career #faang #aws
You are not paid a salary to write code
🇺🇿
This may upset you, but the truth is: you are paid to make customers happy, increase revenue (including securing your own salary) and perhaps reduce AWS costs. Code is just one of the ways to achieve those results.
However, most engineers do this in reverse order and optimize the wrong thing. Most are used to looking for an answer to the question of how: how to build, how to scale, how to write. But sometimes the right answer is not a new system at all: it can be a configuration change, a 15-minute meeting, or simply saying "no" to the requirement.
It looks like engineering, but in reality it is solving the wrong problem based on assumptions.
Two traps that pull engineers in the wrong direction:
Trap 1: Falling in love with the technology. You learned Rust or a new framework. Instead of identifying the right problem and then looking for a tool to solve it, you start, without noticing, looking for problems that fit your favorite tools. That tool can be code, a spreadsheet, or, if need be, a hoe...
Trap 2: Building for requirements you assumed. Even though today's requirement is just one specific thing, you build a general-purpose system. Generalizations are useless until a second real need appears. Before that, you are simply building for a future you invented yourself. Ask yourself: if you have written Java or another OOP language, how many times did your interfaces get a second implementation? Or did they just add complexity and make the code harder to change, without making a single customer any happier?
The fix: start from the goal and work backwards.
Before writing any code, write down the problem in plain language: what is broken, why is solving it important, what benefit will your customers get? No solutions, no framework choices; focus only on the problem.
At Amazon, to work backwards from the goal (more details here), we write PR/FAQs, design docs, and one-pagers. They all come in different formats, but each is focused on one thing: the customer problem.
Because in the end, you are not paid to write code. You are paid to solve the right problems, with or without code.
@abdullaevdev
#engineering #softwareengineering #career #faang #aws
Forwarded from Laziz Abdullaev
Why does AI need mathematical modeling?
Recently, a researcher from a Canadian university attempted to build on my work on Twicing Attention, published at the ICLR 2025 conference.
As you can see in the chart above, Twicing Attention with 7.4 million parameters achieved almost the same result as Standard Attention with 8.8 million parameters (69.6 vs 69.0).
Twicing Attention is a statistical function obtained by mathematically modeling the Standard Attention function via nonparametric regression and then theoretically "boosting" it.
A quote on this from the author of the paper above:
Residual correction in attention. The closest prior work is Twicing Attention [Abdullaev and Nguyen, 2025], which applies Tukey's twicing [Tukey, 1977] within each attention layer. Their correction smooths the residual V - AV with the same attention matrix A, yielding (2A - A²)V. The theoretical justification is from nonparametric statistics: twicing reduces the bias of the Nadaraya–Watson estimator [Newey et al., 2004].
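The formula in the quote can be checked numerically. The sketch below is only an illustration of that one identity, not the paper's implementation: it assumes a single head, plain softmax attention with the usual 1/sqrt(d) scaling, and toy Q, K, V values invented here. It computes the standard output AV, adds the smoothed residual A(V - AV), and confirms the result matches (2A - A²)V.

```python
import math


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(X, Y):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def attention_matrix(Q, K):
    """Row-stochastic matrix A = softmax(Q K^T / sqrt(d))."""
    d = len(Q[0])
    K_t = [list(col) for col in zip(*K)]
    return [softmax([s / math.sqrt(d) for s in row]) for row in matmul(Q, K_t)]


def twicing_attention(A, V):
    """AV + A(V - AV): smooth the residual with the same A and add it back."""
    smoothed = matmul(A, V)
    residual = [[v - s for v, s in zip(v_row, s_row)] for v_row, s_row in zip(V, smoothed)]
    correction = matmul(A, residual)
    return [[s + c for s, c in zip(s_row, c_row)] for s_row, c_row in zip(smoothed, correction)]


# Toy inputs, invented for illustration: 3 tokens, dimension 2.
Q = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
K = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
V = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

A = attention_matrix(Q, K)
standard = matmul(A, V)           # standard attention output AV
twiced = twicing_attention(A, V)  # AV + A(V - AV)

# Closed form from the quote: (2A - A^2) V.
A2 = matmul(A, A)
two_a_minus_a2 = [[2 * a - b for a, b in zip(a_row, b_row)] for a_row, b_row in zip(A, A2)]
closed_form = matmul(two_a_minus_a2, V)

matches = all(
    math.isclose(x, y, rel_tol=1e-9, abs_tol=1e-12)
    for t_row, c_row in zip(twiced, closed_form)
    for x, y in zip(t_row, c_row)
)
print(matches)
```

The two-step form (smooth, then smooth the residual and add it back) reuses the same A, which is consistent with the post's point that the correction comes from the statistical model of attention rather than from additional parameters.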
@lazizabdullaev