# code
- you can and SHOULD use AI when writing code
- please avoid some of the things that AI likes to do in code:
  - write long, unnecessary comments on self-explanatory code
  - use non-ASCII characters in comments
  - generate redundant code (even when it happens to be correct)
  - write repetitive code that could easily be refactored
  - reimplement functionality that is already available in a library
- in short, PRs that are AI-generated without human guidance tend to be unnecessarily long
copied from: https://claude.ai/share/e4bed98a-9049-44b3-9aee-173bba941120
When a Kafka producer sets partitions explicitly, there are several important trade-offs to consider (a short sketch follows this list):
- Guaranteed Message Ordering: Messages sent to the same partition are guaranteed to maintain their order. This is crucial for use cases where sequence matters, like financial transactions or event sourcing.
- Predictable Data Locality: You can ensure related messages always go to the same partition, which helps with consumer processing efficiency and stateful operations.
- Load Distribution Control: You have fine-grained control over how messages are distributed across partitions, allowing you to optimize for your specific access patterns.
- Deterministic Behavior: Your application's behavior becomes more predictable since you know exactly where each message will land.
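As a concrete illustration of the ordering point above, here is a minimal sketch of explicit partition assignment. It assumes the kafka-python client, a broker at localhost:9092, and a hypothetical "transactions" topic with a static account-to-partition map; none of these come from the notes above.

```python
# Sketch: pin all events for an account to one partition so their relative order is preserved.
# Assumes the kafka-python client and a broker at localhost:9092 (not part of the notes above).
import json
from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

ACCOUNT_PARTITION = {"acct-1": 0, "acct-2": 1}  # hypothetical static account->partition map

def send_event(account_id, event):
    # Explicit partition: ordering is only guaranteed within a single partition.
    producer.send("transactions", value=event, partition=ACCOUNT_PARTITION[account_id])

send_event("acct-1", {"type": "debit", "amount": 10})
send_event("acct-1", {"type": "credit", "amount": 5})
producer.flush()
```

The flip side is that the application now owns load balancing: a static map like this can hot-spot a partition and has to be revisited whenever the topic's partition count changes.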
- machine
lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
sda 8:0 0 893.8G 0 disk
└─sda1 8:1 0 893.7G 0 part /
nvme1n1 259:0 0 1.5T 0 disk
nvme0n1 259:1 0 1.5T 0 disk
├─vg_nvme-lv_1 253:0 0 89.4G 0 lvm
├─vg_nvme-lv_2 253:1 0 89.4G 0 lvm
- to enable our extension to the API when using python (boto3 or aws CLI) the following file has to be placed under:
~/.aws/models/s3/2006-03-01/ (the directory should be created if it does not exist) - currently there is no generic solution for other client SDKs; a boto3 check is sketched after the steps below
- start a vstart cluster
- create a bucket:
aws --endpoint-url http://localhost:8000 s3 mb s3://fish
- create a log bucket:
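A minimal sketch for checking that boto3 actually picked up the extended model placed under ~/.aws/models (botocore searches that directory by default). The endpoint is the vstart RGW used above; the credentials are placeholders, since nothing is sent to the server for this check.

```python
# Sketch: confirm botocore loaded the extended S3 model from ~/.aws/models
# (botocore searches that directory by default, so no extra configuration is needed).
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="http://localhost:8000",   # vstart RGW endpoint from the steps above
    aws_access_key_id="placeholder",        # placeholder credentials; no request is signed here
    aws_secret_access_key="placeholder",
)

# The extension's operations should be listed here alongside the standard S3 ones.
print(s3.meta.service_model.operation_names)
```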
how much time did this refactoring work take?
Looking at the timestamps in our conversation, the refactoring work took approximately 45-50 minutes from start to finish.
Timeline Breakdown:
Phase 1: Analysis & Planning (5-10 minutes)
- Understanding the codebase structure
- Identifying the RGWOp class hierarchy
- Finding all canonical_name() overrides
this is a follow-up on: https://gist.github.com/yuvalif/d50bcc3d27121732ffbbafe7b7146112 for an RGW of version "squid" or above, where bucket notifications and topics are synced between zones (a boto3 topic/notification sketch follows the setup steps below)
- deploy a zonegroup with 2 zones:
MON=1 OSD=1 MDS=0 MGR=0 ../src/test/rgw/test-rgw-multisite.sh 2
- export credentials:
export AWS_ACCESS_KEY_ID=1234567890
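A hedged boto3 sketch of creating a topic and a bucket notification on one zone, assuming the first zone's RGW listens on http://localhost:8000, that the matching AWS_SECRET_ACCESS_KEY is also exported, and that the topic name, bucket name, and AMQP push endpoint are placeholders:

```python
# Sketch: create a topic on the first zone and subscribe a bucket to it.
# Endpoint, topic/bucket names, and the AMQP push endpoint are assumptions, not from the notes.
import boto3

endpoint = "http://localhost:8000"  # assumed RGW endpoint of the first zone

sns = boto3.client("sns", endpoint_url=endpoint, region_name="default")
topic_arn = sns.create_topic(
    Name="fishtopic",
    Attributes={"push-endpoint": "amqp://localhost:5672", "amqp-exchange": "ex1"},
)["TopicArn"]

s3 = boto3.client("s3", endpoint_url=endpoint, region_name="default")
s3.create_bucket(Bucket="fish")
s3.put_bucket_notification_configuration(
    Bucket="fish",
    NotificationConfiguration={
        "TopicConfigurations": [
            {"Id": "notif1", "TopicArn": topic_arn, "Events": ["s3:ObjectCreated:*"]}
        ]
    },
)
```

Since notifications and topics are synced between zones on "squid" and above, the same topic and notification configuration should show up on the second zone once metadata sync catches up.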
-- Lua script to auto-tier S3 object PUT requests
-- based on this: https://ceph.io/en/news/blog/2024/auto-tiering-ceph-object-storage-part-2/
-- exit script quickly if it is not a PUT request
if Request == nil or Request.RGWOp ~= "put_obj" then
  return
end
local threshold = 1024*1024 -- 1MB
local debug = true
- start a vstart cluster
- create a tenanted user:
bin/radosgw-admin user create --display-name "Ka Boom" --tenant boom --uid ka --access_key ka --secret_key boom
- create a bucket on that tenant
AWS_ACCESS_KEY_ID=ka AWS_SECRET_ACCESS_KEY=boom aws --endpoint-url http://localhost:8000 s3 mb s3://fish
- create a log bucket with no tenant
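A hedged sketch of the follow-on step, pointing bucket logging of the tenanted bucket at the log bucket with the standard S3 call; the log bucket name "fish-logs", the target prefix, and the cross-tenant addressing are assumptions, not from these notes.

```python
# Sketch: enable bucket logging on the tenanted bucket "fish", targeting the log bucket.
# "fish-logs" is a hypothetical name for the untenanted log bucket created above.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="http://localhost:8000",
    aws_access_key_id="ka",
    aws_secret_access_key="boom",
)

s3.put_bucket_logging(
    Bucket="fish",
    BucketLoggingStatus={
        "LoggingEnabled": {
            "TargetBucket": "fish-logs",  # hypothetical; RGW may need a tenant-qualified name here
            "TargetPrefix": "fish/",
        }
    },
)
print(s3.get_bucket_logging(Bucket="fish"))
```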
