Software Engineering Track - Levent Kantaroğlu, Jamal Hasanov, Amulya Bhatia
Flutter Usage in Firebase Studio
Levent Kantaroğlu
● Flutter, Dart, Firebase
● The Journey of Firebase Studio
● The Capabilities of Firebase Studio
● Developing with Flutter in Firebase Studio
Flutter UI Toolkit
Multi-Platform Fast Development Performance
firebase.studio
Keep up-to-date
Levent Kantaroğlu
- Thanks -
Questions & Answers
Running Processes - Many and Fast
A talk on Linux, Concurrency and Web Servers
Dr. Jamal Hasanov
School of IT and Engineering
Just printing
#include <stdio.h>
int main() {
    printf("Hello, World!\n");
}
a function from C (libc)
#include <stdio.h>
#include <unistd.h>
int main() {
    write(STDOUT_FILENO, "Hello, World!\n", 14);
}
We could also use a lower-level function for the printing.
What is this 14 then? It is the number of bytes to write: "Hello, World!\n" is 14 characters, counting the newline.
Just writing (not printing)
Source: https://manpages.debian.org/unstable/manpages-dev/write.2.en.html
Writing to a file
write(STDOUT_FILENO, "Hello, World!\n", 14);
STDOUT_FILENO defines where the output text shall be
directed to
These are the descriptors from unistd.h:
/* Standard file descriptors. */
#define STDIN_FILENO 0 /* Standard input. */
#define STDOUT_FILENO 1 /* Standard output. */
#define STDERR_FILENO 2 /* Standard error output. */
Why a file descriptor?
fd = open(…)
read(fd)
write(fd,…)
close(fd)
Calling everything a file brings an abstraction – the same approach for
different sources
Does not matter where you write to:
Disk
Network
A buffer of an embedded device
Implementation of the abstraction is done through the drivers
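The fd lifecycle above can be sketched in a few lines. This is a minimal sketch using Python's os module, which wraps the same POSIX open/read/write/close calls shown in the C snippets; the temp file simply stands in for any source you might write to.

```python
# Minimal sketch of the open/read/write/close lifecycle using Python's
# os module, which wraps the same POSIX calls as the C examples above.
import os
import tempfile

def roundtrip(data: bytes) -> bytes:
    """Write bytes through a descriptor, rewind, and read them back."""
    fd, path = tempfile.mkstemp()          # fd = open(...)
    try:
        os.write(fd, data)                 # write(fd, ...)
        os.lseek(fd, 0, os.SEEK_SET)       # rewind to the start
        return os.read(fd, len(data))      # read(fd)
    finally:
        os.close(fd)                       # close(fd)
        os.unlink(path)

print(roundtrip(b"Hello, World!\n"))       # the same 14 bytes come back
```

The same four calls would work unchanged if `fd` referred to a socket or a device node instead of a disk file - that is the abstraction the slide describes.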
In Linux everything is considered as a file
A process is a file too
A process is an instance of a running
program. Every time a program is launched or a
command is executed, a new process with a
unique ID (PID) is created.
proc is a pseudo-filesystem (not stored on disk) that
provides a view into the kernel’s internal data
structures.
procfs stands for Process File System, typically
mounted at /proc.
It is a virtual filesystem - its files and directories exist
only in memory and are generated dynamically by the
kernel.
Each process has its own directory
/proc/<PID>/ containing details such as:
Process files Purpose
/proc/<PID>/cmdline Command-line arguments
/proc/<PID>/cwd Symbolic link to the current working directory
/proc/<PID>/exe Symbolic link to the executable
/proc/<PID>/fd/ Open file descriptors
/proc/<PID>/status General process info (UIDs, memory, state, etc.)
/proc/<PID>/stat, /proc/<PID>/statm Numeric statistics (CPU time, memory usage, etc.)
/proc/<PID>/environ Environment variables
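A quick way to see these files in action: the sketch below parses the `Key:\tValue` lines of `/proc/<PID>/status` and lists the process's open descriptors. It is Linux-only (procfs must exist), and the helper name `parse_status` is ours.

```python
# Hedged sketch: reading a process's details from procfs (Linux only).
# parse_status (our helper) turns "Key:\tValue" lines into a dict.
import os

def parse_status(text: str) -> dict:
    info = {}
    for line in text.splitlines():
        if ":" in line:
            key, _, value = line.partition(":")
            info[key] = value.strip()
    return info

if os.path.exists("/proc/self/status"):        # i.e. we are on Linux
    with open("/proc/self/status") as f:
        status = parse_status(f.read())
    print(status["Name"], status["Pid"], status["State"])
    print(os.listdir("/proc/self/fd"))          # this process's open fds
```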
A process is a file too
Let’s see the files of this process!
Abstraction for communication
A pseudo-terminal, often abbreviated as PTY, is a virtual
device in Unix-like operating systems, including Linux.
It functions as a pair of virtual character devices that
establish a bidirectional communication channel
between processes.
PTY (/dev/pts/1)
Master device (ptm or ptmx) is controlled by a
terminal emulator process or remote login server.
Anything written to the master is given to the slave
as input.
Slave device (pts) acts exactly like a traditional
hardware terminal (tty) device to the programs
running within it. Programs write their output to
the slave, which is then read by the master
process.
The SSH daemon process on the
remote server manages the
connection. It forwards the client
keystrokes received over the
network to the pty master and
reads output from the master to
send it back to the client.
The shell that starts on the remote
server and runs the client's commands. It
receives input from the pty slave,
which is being fed by the SSH
daemon.
How to make two computers talk?
Image source: iStock.
Internet as a file
Server Client
Being blocked
Read on a descriptor blocks if there’s no data
available.
fd = open(…)
read(fd)
Did somebody press a key?
Did somebody move the mouse?
Did somebody send a message over network?
The same is true for write.
Disk files are an exception, since writes to disk
happen via the kernel buffer cache, not
directly to the device.
Being blocked
fd = open(…)
write(fd,…)
Buffer
cache
The only time when writes to disk happen synchronously
is when the O_SYNC flag was specified when opening
the disk file.
Non-blocking file descriptors
A descriptor can be put in the nonblocking mode by
setting the O_NONBLOCK flag
In this case, a call on that descriptor will return
immediately, even if that request can’t be immediately
completed. The return value can be either of the
following:
an error: when the operation cannot be completed at all
a partial count: when the input or output operation can be
partially completed
the entire result: when the I/O operation could be fully
completed
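These three outcomes can be demonstrated with a pipe. In the sketch below (using Python's os and fcntl modules, which wrap the same fcntl/read syscalls), we set `O_NONBLOCK` on the read end: reading with no data returns immediately with an EAGAIN-style error instead of blocking.

```python
# Sketch of a non-blocking descriptor: with O_NONBLOCK set on the read
# end of a pipe, read() fails fast with EAGAIN instead of blocking.
import fcntl
import os

def read_nonblocking(fd: int, n: int):
    """Return data if available, or None when the call would block."""
    try:
        return os.read(fd, n)
    except BlockingIOError:        # errno EAGAIN / EWOULDBLOCK
        return None

r, w = os.pipe()
flags = fcntl.fcntl(r, fcntl.F_GETFL)
fcntl.fcntl(r, fcntl.F_SETFL, flags | os.O_NONBLOCK)

print(read_nonblocking(r, 64))     # None - nothing written yet, no hang
os.write(w, b"hi")
print(read_nonblocking(r, 64))     # b'hi' - data was ready, returned at once
```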
Running more than we can
Context switching is the
process where the CPU saves
the state of a running task and
loads the state of another task
to switch from one to another,
enabling multitasking by
giving the illusion that
multiple processes are
running simultaneously.
Concurrency
Concurrency refers to the ability of a system to
execute multiple tasks through simultaneous
execution or time-sharing (context switching)
Improved performance by executing multiple tasks in parallel.
Better resource utilization, e.g., CPU and I/O devices are kept busy.
Scalability: can handle more clients, requests, or tasks simultaneously.
Modularity: concurrent tasks can be designed as independent components.
Fault isolation: failures in one task may not crash the entire system.
Motivations for concurrent applications:
Multi-threaded vs Multi-process
T1 T2 T3 T4
Process
Threads
MEMORY
Process
MEMORY
Process
MEMORY
Process
MEMORY
Process
MEMORY
Threads and the main process
use the same memory space
Each process uses its own
memory space
Combined model
Image credit: M. van Steen and A.S. Tanenbaum, Distributed Systems, 4th ed., distributed-systems.net, 2023.
Implementation options
Image credit: medium post
Mx1 - the kernel is not aware of any of the user threads; there is only one
thread/process in the kernel, which serves the user-space scheduler.
1x1 - each user thread is mapped to one kernel thread.
MxN - a combination of Mx1 and 1x1: a kernel thread may serve an individual
user thread or the scheduler.
My first discovery of
Concurrency
Case 1: Advanced Payment System
Case 2: Billing System for Azeronline
“Many years later, as he faced the firing squad, Colonel
Aureliano Buendía was to remember that distant
afternoon when his father took him to discover ice”
Gabriel García Márquez
One Hundred Years of Solitude
Advanced Payment System
050123456 Search
Search number
Name & Surname
Make payments
Operation result
Advanced Payment System
Your
number
please
Your
number
please
111 222
Are you
John Smith?
Are you
Bob Adams?
Sorry,
I am Bob
Sorry,
I am John
Concurrency problems
Competition vs Cooperation
Race Condition
A resource
task task task task
update
A resource
task task task task
Concurrency solutions
A resource
A task
Mutex
A resource
A task
Semaphore
notify
A resource
A task
Monitor
Monitor
A resource
A task
Non-blocking
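The mutex solution above can be sketched with threads sharing one counter: the lock guarantees only one task updates the resource at a time, so no increments are lost. (A minimal sketch; real code would also need the lock around any read-modify-write of the shared state.)

```python
# Sketch of the mutex solution: many tasks update one resource, but the
# lock ensures only one thread increments the counter at a time.
import threading

class Counter:
    def __init__(self):
        self.value = 0
        self.lock = threading.Lock()   # the mutex guarding the resource

    def increment(self):
        with self.lock:                # acquire ... release
            self.value += 1

def run(n_threads: int = 4, n_iter: int = 100_000) -> int:
    c = Counter()
    def worker():
        for _ in range(n_iter):
            c.increment()
    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return c.value

print(run())   # always n_threads * n_iter: no lost updates
```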
Concurrency Demo: one processor involved
Execution of the code
Calc in Loop 1
Calc in Loop 2
Calc in Loop 3
Calc in Loop 4
Calc in Loop 5
Calc in Loop 6
Calc in Loop 7
Calc in Loop 8
Calculate total
Concurrency Demo: multiple processors involved
Execution of the code
Loop 1 | Loop 2 | Loop 3 | Loop 4 | Loop 5 | Loop 6 | Loop 7 | Loop 8 (in parallel)
Calculate total
A visual demo – Image Quantization
A practical application – handling client requests
SERVER
But we can serve 6 of them at a time!
A concurrency example
Apache Web Server
Control Process (Parent)
Child Process Child Process Child Process
Listener Threads Server Threads
Web Clients
Problems
Threads are generally lighter than
processes, a large number of
concurrent threads can still
consume significant memory and
CPU resources, potentially
leading to performance
degradation under heavy load or
with a high number of idle
connections.
Thread-per-connection
model can encounter
scalability limitations,
especially when handling
a massive number of
concurrent, long-lived
connections
Creating, managing, and
destroying a large number of
threads can introduce overhead,
especially under fluctuating or
bursty workloads, potentially
impacting overall performance.
C10k problem
Stated by software engineer
Dan Kegel in 1999
Stands for 10,000 concurrent connections
The C10k problem was the
challenge of designing network
servers that could handle
10,000 concurrent client
connections
Nginx story
2002 – Russian software engineer Igor
Sysoev began developing Nginx to solve the
“C10K problem” — handling 10,000
simultaneous connections efficiently.
2004 – Nginx was publicly released as
open-source software under a BSD-like
license.
2006–2010 – It gained popularity for
its event-driven, asynchronous
architecture, outperforming Apache in
serving static content and handling high
concurrency.
2017 – Surpassed Apache in market share
for top 1000 busiest websites, marking a
significant shift in web infrastructure.
Clients
Master
Workers
CPU
Responsible for reading configuration,
creating sockets, managing signals, and
spawning workers.
Does not handle client I/O.
Each worker runs an independent
event loop
Accepts client connections and
handles requests
Each worker is single-threaded, non-blocking, and uses I/O multiplexing
to handle thousands of concurrent connections.
T1 T2 T3 T4
Process
Threads
MEMORY
Process
MEMORY
Process
MEMORY
Process
MEMORY
Process
MEMORY
Resources are naturally shared
(in the same codebase) among
the threads and process
No common memory – none of
the resources of one process is
visible outside
How to share the user requests,
sockets and other resources?
Process
MEMORY
Process
MEMORY
Process
MEMORY
Process
MEMORY
OS Kernel
socket
The descriptor of this socket (a file) is shared among processes.
A socket is listening for requests.
Processes use the OS to check whether there is anything in the socket.
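How can separate processes share one socket descriptor? Through fork: the child inherits the parent's descriptor table, so both point at the same open file description. The sketch below shows this with a pipe (the same mechanism that lets every nginx worker accept() on the one listening socket the master opened).

```python
# Sketch of descriptor sharing across fork(): the child inherits the
# pipe's write end, and the parent reads what the child wrote.
import os

def fork_demo() -> bytes:
    r, w = os.pipe()
    pid = os.fork()
    if pid == 0:                        # child: same descriptors inherited
        os.write(w, b"from child")
        os._exit(0)
    os.waitpid(pid, 0)                  # parent: wait for the child
    os.close(w)
    data = os.read(r, 64)
    os.close(r)
    return data

print(fork_demo())                       # b'from child'
```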
I/O Multiplexing
We showed how a process handles
I/O on a single descriptor.
Often, a process might want to
handle I/O on more than one
descriptor.
I/O multiplexing is a technique
that allows a single thread to
monitor and manage multiple I/O
channels, such as network
sockets, to handle many
connections efficiently
Blocking is not a solution for sure – we cannot keep
waiting!
Non-blocking approach might not be the right approach:
When data is coming in very slowly the program will wake up
frequently and unnecessarily, which wastes CPU resources.
When data does come in, the program may not read it immediately
if it's sleeping, so the latency of the program will be poor.
Handling a large number of file descriptors with this pattern would
become cumbersome.
Solution: I/O multiplexing modes need to be considered
Problem: Requesting
multiple sources in an
asynchronous mode
I/O Multiplexing
There are several ways of multiplexing I/O on
descriptors:
non-blocking I/O - the descriptor itself is marked as
non-blocking, operations may finish partially
signal driven I/O - the process owning the
descriptor is notified when the I/O state of the
descriptor changes
polling I/O - with select or poll system calls, both of
which provide level triggered notifications about the
readiness of descriptors
I/O Multiplexing
non-blocking I/O
All file descriptors are set to non-blocking mode.
If a process tries to perform I/O operations very frequently, it has to
continuously retry operations that returned an error to check whether any
descriptors are ready.
Such busy-waiting in a tight loop could lead to burning CPU cycles.
I/O Multiplexing
signal driven I/O
The kernel is instructed to send the process a signal when I/O can be
performed on any of the descriptors.
The kernel tracks a list of descriptors and sends the process a signal
every time any of the descriptors becomes ready for I/O.
Signals are expensive to catch, rendering signal-driven I/O impractical
for cases where a large amount of I/O is performed.
I/O Multiplexing
polling I/O
The process uses the level triggered mechanism to request
the kernel by a system call which descriptors are capable of
performing I/O.
There are a few I/O multiplexing system calls: select (defined by POSIX),
epoll on Linux, and kqueue on BSD.
These all work fundamentally the same way: they let the kernel know
what events (typically read events and write events) are of interest on
a set of file descriptors, and then they block until something of interest
happens.
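The select pattern in miniature: register interest on two descriptors and block until one of them is readable. A minimal sketch using Python's select module, which wraps the POSIX select(2) call.

```python
# Sketch of polling I/O with select(): monitor two descriptors at once
# and block (up to a timeout) until one of them is ready for reading.
import os
import select

def wait_readable(fds, timeout=1.0):
    ready, _, _ = select.select(fds, [], [], timeout)
    return ready

r1, w1 = os.pipe()
r2, w2 = os.pipe()
os.write(w2, b"ping")                  # only the second pipe has data

print(wait_readable([r1, r2]))         # select reports just r2 as ready
print(os.read(r2, 4))                  # b'ping'
```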
epoll
epoll is a Linux kernel system
call for an I/O event notification
mechanism
epoll monitors multiple file
descriptors to see whether I/O is
possible on any of them.
epoll uses a red–black tree (RB-
tree) data structure to keep
track of all file descriptors that
are currently being monitored
Edge-Triggered Polling (ET)
Events are delivered only when the state transitions (from not-ready →
ready).
Example: you get a read event only when new data arrives, not while data remains
unconsumed.
Requires reading/writing until EAGAIN to avoid missing future events.
More efficient because it avoids repeated notifications.
More complex to implement correctly - mismanagement may lead to
stalled I/O.
Level-Triggered Polling (LT)
The default behavior in many polling APIs.
A file descriptor is reported as “ready” as long as the condition remains
true
(e.g., there is unread data in the buffer).
The application may receive repeated readiness notifications until the
condition is cleared.
Safer and simpler but may cause unnecessary repeated wake-ups.
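Both modes can be tried with Python's `select.epoll` wrapper (Linux only). The sketch below uses the default level-triggered mode: the pipe keeps being reported as ready while unread data remains, and the condition clears once we consume it.

```python
# Sketch of Linux epoll in its default level-triggered mode: the
# descriptor is reported as ready for as long as unread data remains.
import os
import select

r, w = os.pipe()
ep = select.epoll()
ep.register(r, select.EPOLLIN)         # interest: readable events on r

os.write(w, b"data")
events = ep.poll(1.0)                  # [(r, EPOLLIN)] while data is unread
print(events)
print(os.read(r, 64))                  # consume it; the condition clears

# Edge-triggered mode would register select.EPOLLIN | select.EPOLLET and
# then read until EAGAIN, since the event fires only on the transition.
ep.close()
```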
Nginx - Full Scenario
The master reads nginx.conf, creates
listening sockets, binds to ports (e.g., 80,
443), and sets them as non-blocking.
Then it forks the worker processes.
Each worker inherits the listening socket file
descriptors from the master.
During its startup routine each worker:
Initializes its event loop subsystem.
Calls epoll_create().
Stores the returned epoll file
descriptor for future event registration.
Context Switch and Pinning
Context Switching in Multiprocessing
The CPU rapidly switches between processes,
pausing one and resuming another so they can
“share” the same core.
Each switch has a cost: the CPU must save the
current state, load another one, and refill caches,
which can slow systems down under heavy load.
Nginx ties each worker process to a specific CPU
core, avoiding unnecessary movement between
cores.
This improves performance and lowers latency,
because the worker keeps its core’s warm caches
and avoids extra context-switch overhead.
Customer Service Analogy - Apache
Only 4 computers available
Hired 25 people to handle customer requests
Customer Service Analogy - NGINX
Only 4 computers available
Only 4 employees – each with a dedicated PC
Example: one customer may need 10 seconds of work, while another may
require an approval from the head office or the credit system.
Employees pick the next task when they have time (e.g., while a previous
task is waiting on something).
Another asynchronous event loop model – Node.js
Concurrency in Functional
Programming languages
Immutability simplifies concurrency
No shared mutable state: fewer race
conditions than in imperative code.
Pure functions parallelize naturally
No side effects: tasks run safely on different
cores without coordination.
Actor-based concurrency
Many FP languages use message-passing
(Erlang, Elixir, Scala/Akka) instead of shared
memory.
Higher-level concurrency primitives
Futures, promises, parallel map, and lazy
streams reduce the need for locks.
Icon source
Recommendations
1. Learn Linux - it is a monolithic OS; check out the alternatives. What
will the OSs in quantum computers be?
2. Learn concurrency programming.
3. Learn low-level programming - you cannot get ready for quantum if you
don't understand hardware well.
Questions?
References
What is epoll? A Medium post
Non-blocking I/O – A Medium post
Blocking I/O, Nonblocking I/O, And Epoll (link)
1
Hey all,
I’m Amulya Bhatia!
Chief Architect Gen AI SME GDE
They/Them
My ADK Book LinkedIn
Firebase Studio is an
integrated and extensible
agentic workspace to
build, run, and manage
web apps, cross-platform
mobile apps, backend
services, and more.
Web-based
Full cloud VM
Deep Google
Integrations
Preconfigured
Environments Live preview
Designed for
collaboration
Workspace sharing available
with more features planned.
AI Assistance
Built on VS Code
Your workspace is a URL,
installable as a PWA.
Supports most toolchains.
Start from common starting
points or customize your own.
Streamline app dev workflows.
Across code, test,
debugging, etc.
Real device, emulator,
or hosted IFRAME.
World-class code editing.
www.firebasestudio.com
Multimodal /
Natural Language
Create a
blank project
Templates
Open an
Existing Project
Agentic
Experiences
Throughout
Our advanced coding capabilities are enhanced with agentic experiences
that take complex (or boring) actions on your behalf. Whether the changes
need to happen across a section of code, a single file, or an entire
codebase, Gemini will understand your intent and accomplish the task.
AI-centric View Code-centric View
We want to ensure you all have a choice in using as much or as little AI
as you want when building your apps.
Share and
Collaborate
in Real-Time
Not only can you share the deployed link, you can share the entire
workspace with a URL.
This means you can collaborate in
real-time within the same Firebase
Studio environment, and then push
updates instantly.
Firebase Studio in
Action
App Prototyping Agent
To build our full stack web application, we can start with a natural
language prompt and it will create a PRD for us to review.
After we modify the proposal
as needed we can generate the
app and iterate in chat to
update and rebuild the
application.
We can prototype quickly to get a Next.js app with Genkit for agentic
features that is connected to Firebase!
Quickly Deploy to Firebase
We can quickly deploy our
application we created to
Firebase App Hosting.
The wizard will guide you in picking the correct project and billing
account and kicking off the new or updated deployment.
Firebase Studio supports existing codebases and any stack that can be
installed with Nix (120k+ packages).
Exploring the Code
Every workspace is backed by a full IDE in which we can edit the
generated code.
There is a full VM backing
Firebase Studio so you can run
commands that would usually
fail in the browser.
Editing the code with Gemini
In the IDE Code view of
Firebase Studio we can still
have the full power of Gemini
in the workspace.
Gemini can read and write
files and run terminal
commands with the full context
of the project, recently
opened files and any
attachments sent.
a lightweight, powerful, and accessible CLI
tool that integrates cutting-edge AI
directly into the terminal.
Proprietary + Confidential
Gemini CLI
Code and Files Invoke tools Coordinate with
other apps
Coordinate with other
applications like VS Code
or Chrome, performing
actions or gathering
context
Comprehensive
Context
Including project files,
data, and potentially even
screen sharing – to
provide the most relevant
and effective assistance
Generate code and
manage files (the core CLI
function)
Intelligently invoke other
developer tools, MCP
support, to manage local
development, run tests,
interact with cloud
services, etc
Gemini CLI Architecture
User via Terminal packages/cli packages/core
Tools / MCP Servers
Gemini/Code Assist API
User sends
tasks/prompts via
terminal and receives
outputs and requests
(to perform actions via
tools)
Sends user
requests/confirmations
to the core and receives
tools details and final
outputs from the core
Sends prompts/tools
info to Gemini
Executes Tools and
receives results
Local Tools (file, shell, web)
Local and Remote MCP
Servers
Interactive and File/Shell
Integration
/ commands
@ context
! shell
Powerful Built-In Tools
File system tools (read_file, write_file,
list_directory, search_file_content),
Shell tool (run_shell_command)
Web tools (web_fetch,
google_web_search)
Hierarchical Context & Memory
Project-specific and global instructions to the AI using GEMINI.md files
Secure Sandboxing
Run in isolated environment using
technologies like Docker, Podman,
or macOS's native Seatbelt
functionality
Custom Commands and
MCP
Create custom slash commands for frequently used prompts.
Model Context Protocol (MCP) servers
32
Gen AI Toolbox benefits
Better manageability, observability, and security for your gen AI agents
Reduced boilerplate code to simplify tool development
Efficient connection pooling and optimized connectors for databases
Configuration-driven approach enables deployment without interruption
Zero downtime
Enhanced security
Better performance
Simplified development
End-to-end observability: integrated with Google Cloud monitoring and tracing.
Provides simple patterns for integrating user authentication.
Toolbox flow
1. Access Database - specify the URI
2. Define tools (tools.yaml)
3. Load tools and invoke a tool
● Configuration (tools.yaml)
○ Users define several resources in a file
○ 3 sections: Sources, Tools, Toolsets
○ Each is defined in a map of name → object definition
○ Toolbox loads them on start up and builds the appropriate APIs
Source Tools Toolset
Cloud
SQL
AlloyDB
Postgres
interacts with
one of
postgres-sql
( user
defined )
Define sources
sources:
  # This source kind has some requirements. See
  # https://github.com/googleapis/genai-toolbox/blob/main/docs/sources/cloud-sql-pg.md#requirements
  my-cloud-sql-source:
    kind: cloud-sql-postgres
    project: my-project-name
    region: us-central1
    instance: my-instance-name
    user: my-user
    password: my-password
    database: my_db
Sources represent a source of data that a tool can use. Typically they encapsulate _how_ a
database is connected to – IP address, credentials, etc.
Define Tools
tools:
  get_flight_by_id:
    kind: postgres-sql
    source: my-cloud-sql-source
    description: >
      Use this tool to look up a single flight by its unique id. The agent
      can decide to return the result directly to the user.
    statement: "SELECT * FROM flights WHERE id = $1"
    parameters:
      - name: id
        type: int
        description: "'id' represents the unique ID for each flight."
Tools are an action, typically executed on a source. These include description of how/when to
take the action, and things like which parameters are specified.
Define Toolsets
toolsets:
  my_first_toolset:
    - my_first_tool
    - my_second_tool
  my_second_toolset:
    - my_second_tool
    - my_third_tool
Tool sets are logical groups of tools. You can load all tools or tools by toolset to pass into an
agent.
# This will load all tools
all_tools = await client.load_toolset()
# This will only load the tools listed in 'my_second_toolset'
my_second_toolset = await client.load_toolset("my_second_toolset")
Demo
38
AI TRACK - Mirakram Aghalarov & Zahra Bayramli, Nishi Ajmera, Kamran
Huseynov & Tarlan Huseynov, Nikhilesh Tayal, Yoyu Li
WHEN LLM STAYS HOME:
BUILDING KNOWLEDGE
SYSTEMS ON PREMISES
Mirakram Aghalarov
Zahra Bayramli
</SPEAKERS />
Mirakram Aghalarov Zahra Bayramli
Senior Deep Learning Engineer @SOCAR
CIC
Lecturer of AI&ML courses at BHOS
MSc Data Science and Engineering Graduate
from Politecnico Di Torino
Deep Learning Engineer @SOCAR CIC
BSc Computer Science Graduate from
Korean Advanced Institute of Science &
Technology (KAIST)
//Table of contents
{02}
{01}
{03}
Cloud Infrastructure
On-Premise Infrastructure
Hybrid Solution
</Cloud Infrastructure
01
</AI Start Ups Worldwide
2024 2025
~92000 ~212000
Delivery Hero SE is a German multinational online food ordering
and food delivery company based in Berlin, Germany.
REI is a member-owned retailer that sells outdoor gear,
promotes sustainable adventure and customer specific design
using GenAI.
</Distribution over Cloud
The AI developer platform to build AI agents, applications,
and models with confidence
Accelerating AI development and deployment with a secure
collaboration platform for AI developers and data providers
</Distribution over Cloud
Deepset is an enterprise software vendor that provides
developers with the tools to build production-ready Artificial
Intelligence and natural language processing systems.
Robin AI is a legal-tech company that provides an AI-powered
platform to review, analyze, and manage contracts far more
quickly and securely.
</Distribution over Cloud
</How to build simple Chatbot
Vertex AI Studio
Vertex AI Search
Agent Garden
Ray
</How to build simple Chatbot
</How to build simple Chatbot
GCP provides an easy-to-use pipeline builder to establish the flow.
To build a simple RAG agent, Vertex AI Studio helps integrate the chatbot
with Vertex AI Search capabilities based on the selected vector database.
</How to build simple Chatbot
AI Document Intelligence
AI Content Understanding
OpenAI Model Registry
Azure Machine Learning
</How to build simple Chatbot
</How to build simple Chatbot
Amazon Sagemaker
Amazon Bedrock
Amazon Lambda
AWS S3
Amazon Kendra
Amazon API Gateway
</How to build simple Chatbot
Amazon Sagemaker Amazon Bedrock
Amazon Lambda AWS S3
Amazon Kendra
Amazon API Gateway
</Agentic Cloud Tools
01
</Agentic AI
LLM
</Agentic AI
LLM
</Agentic AI
My dear, Tell me about weather in our 4th plant
What a marvellous question! I am going to check
what is the location of “our 4th plant” …
Okay! Now location of plant with ID 4 is known.
SQL call to the database
</Agentic AI
My dear, Tell me about weather in our 4th plant
What a marvellous question! I am going to check
what is the location of “our 4th plant” …
Okay! Now location of plant with ID 4 is known.
SQL call to the database
Now lets look at the search on services about the
weather information
The answer is 26 degrees Celsius
Searching through the Web
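The two-step dialog above can be sketched as a toy agent loop. The tool names (`plant_location`, `weather`) and their canned data are hypothetical stand-ins for the SQL call and web search in the slides; a real agent would let the LLM choose the tools and arguments at each step.

```python
# Toy sketch of the agentic flow above: resolve "our 4th plant" via a
# (hypothetical) SQL-backed tool, then query a (hypothetical) weather tool.
def plant_location(plant_id: str) -> str:
    """Stand-in for the SQL call to the plant database."""
    return {"4": "Baku"}.get(plant_id, "unknown")

def weather(city: str) -> str:
    """Stand-in for the web search on weather services."""
    return {"Baku": "26 degrees Celsius"}.get(city, "unknown")

def answer_weather_for_plant(plant_id: str) -> str:
    city = plant_location(plant_id)     # step 1: find the plant's location
    return weather(city)                # step 2: look up the weather there

print(answer_weather_for_plant("4"))    # 26 degrees Celsius
```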
</Model Context Protocol
</ReAct Agent
Query
Answer
Tools
Thought
</Cloud Evaluation
Very Low Downtime
High Scalability
Mature Solutions
Low Code Environment
Faster Deployment
No CAPEX
Higher OPEX
Network Bandwidth Limitation
Uncontrollable Cost
Not Suitable for Real time
Limited Privacy
</AWS Spread
Limited Privacy
</Azerbaijan Situation
Limited Privacy
Azerbaijani legislation does not allow private information to be processed
in cloud infrastructure because of where that infrastructure is located.
Cyber threats and privacy insurance do not let any sensitive information
be sent outside the borders of Azerbaijan.
</Azerbaijan Situation
Limited Privacy
Higher time for MVP
Higher requirements to
accomplish the task
Less number of startups
</On-prem
Infrastructure
02
Data Privacy and Compliance
- Sensitive documents, internal systems
</Need for On-prem
Restricted Environments
- Factories, labs, government facilities
with no external internet access
Cost
- Cloud APIs become expensive with higher
usage
Full Customization and Control
- Ability to retrain, fine-tune, or modify
models without vendor limits
</RAG Architecture Example
Document Ingestion Embedding Generation Vector Database
Vector Representation
Text/Table
Extraction
Chunking
Embedding Model
Stores embeddings and allows fast similarity search
0.2 0.7 0.5 …
Optical Character
Recognition
</RAG Architecture Example
Retriever LLM (on-prem) Application Layer
Vector Search
Hybrid Search
Reranker
Prompt + Retrieved Chunks UI/Chat Interface
FastAPI/Tools
Output
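The retrieval step in the pipeline above boils down to comparing a query embedding against stored chunk embeddings. A minimal sketch: the toy 3-d vectors and chunk names are illustrative; a real system would use an embedding model and a vector database with approximate search.

```python
# Hedged sketch of vector retrieval: rank stored chunks by cosine
# similarity to the query embedding. Toy vectors stand in for real ones.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

STORE = {                               # chunk -> embedding (illustrative)
    "refinery manual": [0.9, 0.1, 0.0],
    "hr policy":       [0.0, 0.8, 0.6],
}

def retrieve(query_vec, k=1):
    ranked = sorted(STORE, key=lambda c: cosine(STORE[c], query_vec),
                    reverse=True)
    return ranked[:k]

print(retrieve([1.0, 0.0, 0.0]))        # ['refinery manual']
```

A reranker, as in the slides, would then re-score just these top-k chunks with a heavier model before they are passed to the LLM prompt.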
</On-premise Infrastructure
OCR
DB
LLM
Tools
</On-premise Infrastructure
OCR
DB
LLM
Tools
GPU
Data
Center
</On-premise Infrastructure
OCR
DB
LLM
Tools
GPU
Data
Center
Some Cloud
in
Azerbaijan
</Challenges with Open-Source OCR
Complex Tables
Merged cells, nested tables
VLMs Offline
Large, slow, proprietary
Handwritten documents
Cloud APIs > open source
Azure AI Document
Intelligence
Google Vision API
- High VRAM Requirements
- No fine-tuned OCR pipelines
- Latency too high without GPU clusters
</Vector Database
Core Components of
On-Prem Retrieval
Local Vector Database Embedding Model
Reranker
bge-m3 E5-large
Nomic-embed
Jina v2
MiniLM
Instructor-xl
bge-reranker-large
colbert
</Vector Database
Capabilities Missing
Hybrid Search
- Cloud versions tune this automatically; on-prem
requires manual scoring & fusion
Premium Features
- E.g. Reciprocal Rank Fusion (RRF) in ES not
available on-premise; requires custom implementation
Enterprise-Grade Monitoring & Analysis
- Cloud dashboards show slow queries, index health,
drift detection etc.
</Challenges with Open-Source Models
Sensitivity to Noisy Input
- DeepSeek/Qwen degrade more on messy documents
Noisy input: "Mtng w/ team @ 4pm / pls updte repprt / Q3 target = 1O0K ??? / ask Sam re: buget"
Llama: "Meeting with the team at 4pm. Update the report. Q3 target is 100K. Ask Sam about the budget."
Qwen: "The meeting is about 4pm and a report, possibly about budget. Not sure what target means."
</Challenges with Open-Source Models
Sensitivity to Noisy Input
- DeepSeek/Qwen degrade more on messy documents
Context Size Expansion
- Big documents ⇨ huge prompts ⇨ slower inference
Quality & Alignment
- More hallucination, low quality answers
</Agentic AI On-Prem
Model Context Protocol (MCP)
- On-prem LLMs often not aligned for tool calls
MCP HOST
MCP Server A
MCP Server B
MCP Server C
MCP Protocol
MCP Protocol
MCP Protocol
Local Files Storage
External APIs & Apps
Remote Database
</Agentic AI On-Prem
Unreliable Function Calling
- No built-in “tool calling” alignment like ChatGPT or Gemini
ReACT Agents do not work
- Lose state across steps ⇨ repeat wrong calls. Produce inconsistent
“thought/action” formatting. Hallucinate tool names or steps
</Agentic AI On-Prem
Knowledge Graph Integration
Entity Extraction
Database
Graph DB
Entities and
relationships
Agent
Graph Query Generation Output
LLM Result
User Query
Graph DB
</Agentic AI On-Prem
Knowledge Graph Integration
- LLM extracts structured
entities and relations using
long rule-based prompts
- Recursive multi-pass
extraction across chunks
</GPU infra: Inference & Deployment
TensorRT-LLM
- Provides 2-4x faster throughput compared to raw PyTorch
- Provides better parallelism
- Supports FP8/INT4 quantization to reduce memory usage
- Requires custom engine building based on GPU model
</GPU infra: Inference & Deployment
vLLM / TGI
- High-throughput distributed serving
- Good batching & streaming performance
- Supports HF models & the OpenAI API standard
- Requires full GPU stack
Ollama / llama.cpp
- Easy setup, simple single-node runtime
- Supports GGUF quantized models, small memory footprint
- Limited scaling & lower token throughput
</On Premise Evaluation
Pros: High Privacy, Faster Communication, Real-time Streaming,
Full Control over Cost, More Customization
Cons: Limited Scalability, High CAPEX, High Downtime,
Higher Development Time
</Best of 2 Worlds
High Privacy
Faster Communication
Real-time Streaming
Full-control over Cost
More Customization
Very Low Downtime
High Scalability
Mature Solutions
Low Code Environment
Faster Deployment
Hybrid Is a Possible Choice
Video Stream
Text Stream
Data Stream
Clean my code! Now!
Blurring
Masking
Anonymization
Hybrid Is a Possible Choice
Blurring
Masking
Anonymization
LLM
Cloud
Tool 1 Tool 2 Tool 3
Demasking
Hybrid Is a Possible Choice
Tool 3
Demasking LLM Answer
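The blurring/masking/demasking flow above can be sketched in a few lines: sensitive values are replaced with placeholders before the text leaves the premises, and the cloud LLM's answer is demasked locally. The regexes and placeholder format are illustrative assumptions; a production system would cover many more PII types.

```python
import re

def mask(text):
    """Replace emails and phone numbers with placeholders before sending to the cloud."""
    mapping = {}
    def repl(kind):
        def _repl(m):
            key = f"<{kind}_{len(mapping)}>"
            mapping[key] = m.group(0)   # remember original value for demasking
            return key
        return _repl
    text = re.sub(r"[\w.+-]+@[\w-]+\.[\w.]+", repl("EMAIL"), text)
    text = re.sub(r"\+?\d[\d -]{7,}\d", repl("PHONE"), text)
    return text, mapping

def demask(text, mapping):
    """Restore the original values in the cloud LLM's answer."""
    for key, value in mapping.items():
        text = text.replace(key, value)
    return text

masked, mapping = mask("Contact john@example.com or +994 50 123 45 67")
answer = demask(f"Reply sent to {list(mapping)[0]}", mapping)
```

The cloud side only ever sees placeholders like `<EMAIL_0>`; the mapping never leaves the on-prem environment.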
//Conclusion
It is not easy to keep pace with worldwide development from Azerbaijan.
The only way out is to understand the reasons behind the limitations and
find ways to bypass them, eliminating the disadvantages of both sides.
CREDITS: This presentation
template was created by
Slidesgo, and includes icons
by Flaticon, and infographics
& images by Freepik
//Thanks,
Do you have any questions?
Using MCP, A2A and ADK
Building
Conversational
Agents
Nishi Ajmera
Solutions Architect
Publicis Sapient
A software entity designed to act
autonomously to achieve specific goals
Performs tasks, interacts with users, utilizes
external tools
Goes beyond simple input/output – they can
reason, plan, and orchestrate
What is an AI Agent ?
AI systems that use natural language to interact with users and
complete tasks
What are Conversational Agents ?
User
Coding Assistant
Assistant (Siri, Alexa)
Support Bot
Modularity: Break down complex problems into
smaller, manageable agent tasks
Specialization: Create expert agents for specific
functions (e.g., a "billing agent," a "research agent")
Collaboration: Agents can work together, delegating
tasks and sharing information
Scalability & Maintainability: Easier to update, debug,
and scale individual components.
From simple tasks to complex workflows
Agentic Architectures
Agent Development Kit
(ADK)
A flexible and modular framework for developing and
deploying AI agents.
Key Goals
Make agent development feel like software
development.
Simplify creation, deployment, and
orchestration.
Core Principles
Model-agnostic (Optimized for Gemini, but
supports others via LiteLLM).
Deployment-agnostic (Local, Cloud Run, Agent
Engine).
Compatibility with other frameworks (e.g.,
LangChain, CrewAI).
Agent Development Kit
Model Context Protocol
MCP standardises the way AI models and tools communicate and
share context
Why MCP ?
Why MCP ?
Agent to Agent Protocol
Enables seamless communication and coordination between
multiple AI agents
Main Actors in A2A
How does A2A work ?
Agent to Agent & MCP
MCP vs A2A
Aspect | MCP (Model Context Protocol) | A2A (Agent2Agent Protocol)
Purpose | Agent ↔ Tools/Resources | Agent ↔ Agent Collaboration
Communication | Client-Server (function-like) | Peer-to-Peer (conversational)
State | Stateless (tools as functions) | Stateful (task lifecycle)
Best For | Accessing tools, APIs, databases | Multi-agent coordination
Agent as a Tool ?
When to Use Agent-as-Tool (MCP):
Single orchestrator architecture
Short-to-medium tasks (minutes to ~1 hour)
Need tight control over workflow
Simple request-response patterns
You need deterministic, structured
interactions
When to Use A2A:
Long-running tasks (hours to days)
Peer collaboration and negotiation
Dynamic agent discovery
Multi-vendor ecosystems
MCP ensures agents have the right
context and tools to operate
efficiently, while A2A enables
seamless collaboration. Together,
they create a powerful,
interoperable AI ecosystem
Questions ?
Thank You
Agentic Workflows with AWS
DevFest 2025
Tarlan Huseynov
DevOps Engineer
AWS Community
Kamran Huseynov
AI Engineer
PHASE 1: Chunking, Embedding & Storing
PHASE 2: Semantic Search / Embedding
Inference + Similarity Search
PHASE 3: Augmented Generation & Response
Semantic Search and Retrieval in RAG Pipeline: embed → search → retrieve →
generate
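The three phases above can be sketched end to end. This is a toy illustration only: the bag-of-characters "embedding" and in-memory store stand in for a real embedding model and a vector store such as a Bedrock Knowledge Base, and generation is stubbed out as prompt construction.

```python
import math

def embed(text):
    """Toy embedding: bag-of-characters vector (a real pipeline calls an embedding model)."""
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - ord("a")] += 1.0
    return vec

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

# PHASE 1: chunk, embed, store
chunks = ["Nginx uses an event loop", "Apache spawns worker threads"]
store = [(c, embed(c)) for c in chunks]

# PHASE 2: embed the query and run similarity search
query = "event loop servers"
top_chunk = max(store, key=lambda item: cosine(embed(query), item[1]))[0]

# PHASE 3: augment the prompt with the retrieved chunk (LLM call omitted)
prompt = f"Answer using this context: {top_chunk}\nQuestion: {query}"
```

Swapping in a real embedding model and vector index changes the components but not the embed → search → retrieve → generate shape.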
Amazon Bedrock Knowledge
Base
Strands Agents Solution
Strands Agents – AWS-backed:
production-ready agent workflows,
model-driven, composable, and
emphasizing simplicity
MCP – Modular tool servers,
scalable & observable
Self-Managed provisioning for
logical stack (ECS + Fargate)
Building With Strands Agents
GitHub
LinkedIn Meetup WhatsApp
Production-ready AI Agents
AI Agents
LLMs
+ reasoning,
+ external applications,
+ self-reflection capabilities
Do you
need an
AI Agent ?
Give me 10 ideas for my Twitter post on AI
Prepare a report of top AI research papers
Send Sachin a leave request and update my
calendar accordingly.
Book the cheapest flight from Delhi to Dubai
Translate this paragraph from Hindi to English
Write an email requesting leave in a polite tone
AI Search
Agent
How an Agent can access a tool
Without AI Agent
AI Agent with Google Search
How the model was thinking
Memory
Memory is
important for
conversations
AI Agents actually don’t have any memory
Memory has to be handled outside LLM
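Since memory lives outside the LLM, the application keeps the history and prepends it to every call. A minimal sketch, assuming a simple sliding-window policy (class and method names are illustrative, not a specific framework API):

```python
class ConversationMemory:
    """Minimal sketch: the application, not the LLM, keeps the message history."""

    def __init__(self, max_messages=20):
        self.max_messages = max_messages
        self.messages = []

    def add(self, role, content):
        self.messages.append({"role": role, "content": content})
        # Self-managing trim: drop the oldest turns once the window is full.
        self.messages = self.messages[-self.max_messages:]

    def as_prompt(self):
        return "\n".join(f"{m['role']}: {m['content']}" for m in self.messages)

memory = ConversationMemory(max_messages=4)
memory.add("user", "My name is Nishi")
memory.add("assistant", "Nice to meet you, Nishi!")
memory.add("user", "What is my name?")
# Every LLM call is prefixed with the stored history:
prompt = memory.as_prompt() + "\nassistant:"
```

More advanced variants (summarising old turns, letting the agent edit its own memory via a tool) follow the same pattern: state is managed by the application and injected into the prompt.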
AI Agent
Memory
Memory is important for
conversations
AI Agent without memory
AI Agent without memory
Adding a conversation memory
Adding a conversation memory
AI Agent with a conversation memory
Adding a conversation memory
Adding a conversation memory
Self
managing
memory
AI Agent with self-managing memory
AI Agent conversation
AI Agent conversation
Making memory editable
AI Agent with self managing memory
Tool for memory management
Defining Agent’s memory
Managing memory on its own
There are different kinds of “Memory”
AI Agent
Evaluation
How to evaluate the
chaos?
Can you identify what went wrong here?
What could go wrong with AI Agents
What could go wrong with AI Agents
Evaluation – Traditional Software vs AI Agents
Evaluation – Traditional Software vs AI Agents
Suite of LLM Evaluation Methods
Suite of LLM Evaluation Methods
AI Agent evaluation
Evaluation
Techniques
- Code-based evals
- LLM as a judge
- Human annotations
Code based evals
Regex
JSON Parsable
Code based evals
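The two code-based checks named above (regex match, JSON parsability) can be implemented directly; a minimal sketch, with the date pattern as an illustrative expected format:

```python
import json
import re

def eval_json_parsable(output):
    """Code-based eval: does the agent output parse as JSON?"""
    try:
        json.loads(output)
        return True
    except json.JSONDecodeError:
        return False

def eval_regex(output, pattern=r"\d{4}-\d{2}-\d{2}"):
    """Code-based eval: does the output match an expected format (here: a date)?"""
    return re.fullmatch(pattern, output) is not None

results = {
    "json_ok": eval_json_parsable('{"action": "search", "query": "flights"}'),
    "json_bad": eval_json_parsable("search flights"),
    "date_ok": eval_regex("2025-11-23"),
}
```

These evals are cheap and deterministic, which is why they are usually the first layer before LLM-as-a-judge or human annotation.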
LLM as a judge
LLM as a judge – Important considerations
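The LLM-as-a-judge pattern boils down to a rubric prompt plus score parsing. A hedged sketch follows: the rubric wording and `SCORE:` convention are assumptions, and the judge is stubbed with a lambda so the parsing logic can be shown without a real model call.

```python
import re

# Hypothetical rubric prompt; real judges usually add criteria and few-shot examples.
JUDGE_PROMPT = """You are an impartial judge. Rate the answer from 1 to 5
for correctness and helpfulness. Reply as: SCORE: <n>
Question: {question}
Answer: {answer}"""

def llm_as_judge(question, answer, judge_llm):
    """Ask a (usually stronger) model to grade the answer; parse the numeric score."""
    verdict = judge_llm(JUDGE_PROMPT.format(question=question, answer=answer))
    match = re.search(r"SCORE:\s*([1-5])", verdict)
    return int(match.group(1)) if match else None

# Stub judge for illustration; in practice judge_llm would call a real model.
score = llm_as_judge("Capital of Azerbaijan?", "Baku", lambda prompt: "SCORE: 5")
```

Returning `None` on an unparseable verdict matters: judge outputs themselves can be malformed, which is one of the "important considerations" of this technique.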
Annotations
Elements to
evaluate
- Tool choice
- Generation
- Path choice
If your Agent’s output is correct,
does the trajectory matter?
ADK
Open foundation for making
production-ready agents
Google ADK – available as a Python SDK
Open Foundation
Agent Development Kit
Google ADK – Agent Development Kit
And one command to debug
Happy to connect on LinkedIn
- Founder at “AI ML etc.”
- AI Instructor at LinkedIn Learning
- Google Developer Expert for AI
- IIT Kharagpur alumnus
Thanks a lot!
Creative Technology Director, Infinite Whys
Baku
Beyond the Prompt
An Anatomy of an AI-Powered Game Using Agent Development Kit
Yoyu Li
2025
What we are going to talk about
● what ADK is, and why (spoiler: Agent Development Kit)
● a game demo
● a simple agent
● the multi-agent pattern
● a custom agent
01
what is ADK*?
and why?
*Agent Development Kit
An Evolution:
LLM + Prompt → LLM + Retrieval (RAG) → LLM + Retrieval + Tools → Agent (+ many tools, + reasoning loop) → Multi-Agent Systems
AI Agents reason, plan, and execute tasks for users.
Generative AI key components (the end user sends a Query and receives a Response):
- Model(s): used to reason over goals, determine the plan and generate a response (an Agent can use multiple models)
- Orchestration: model-based reasoning/planning and task execution loop; executes the steps of the LLM-derived plan to accomplish given tasks, including tool invocations and maintenance of intermediate state
- Agent Definitions: profile, goals, instructions, tools, …
- Tools: fetch data, perform actions or transactions by calling other APIs or services (Functions, APIs, Databases)
- Memory: short-term and long-term
Is there something more personal?
ADK helps us build a good
architecture for AI-powered
applications. Having specialised
agents means we can
potentially use smaller/more
efficient AI, or even no AI. And
we should only use Gen AI
when it makes sense to do so.
02
Demo time
A simple game built in Python
(this is for demo only, therefore not a very sophisticated application)
03
An anatomy
What is under the hood
question agents folder
validation agent folder
file structure
main.py
ui_components.py
game_controller.py
.env file
adk_runners.py
test_runners.py
agent.py
agent.py
display layer
control layer
ADK runners
& sessions
Simple Agent
Multi Agents
Git repo: https://github.com/yoyu777/adk-game-demo-public
04
a simple LLM agent
to validate the user
input
AI validation agent
create
the Agent
creating the agent using
ADK command line interface
> adk create [agent_name]
from pydantic import BaseModel, Field
from google.adk.agents import LlmAgent

class ValidationOutput(BaseModel):
    is_valid: bool = Field(..., description="Indicates if the input is a valid object for the game.")
    reason: str = Field(..., description="Explanation of why the input is valid or not.")

root_agent = LlmAgent(
    name="validation_agent",
    model="gemini-2.5-flash",  # Or your preferred Gemini model
    instruction="You are in charge of validating the user's initial input",
    description="""The user is going to play a game of 20 Questions. Before starting the game,
    you need to validate the user's input to ensure it is a valid object for the game. A valid object should be
    a noun like the name of an object, an animal, or a concept, and not too obscure. For example, "cat", "car",
    "apple" are valid, but "quantum entanglement" or "the number seven" are not.""",
    output_schema=ValidationOutput
)
https://github.com/yoyu777/adk-game-demo-public/blob/main/validation_agent/agent.py
debug
the Agent
debugging the agent
in browser
debugging the agent using
ADK command line interface
> adk web
> adk run [agent_name]
question agents folder
validation agent folder
Call the agent in the game
main.py
ui_components.py
game_controller.py
.env file
adk_runners.py
test_runners.py
agent.py
agent.py
display layer
control layer
ADK runners
& sessions
Simple Agent
Multi Agents
Runners, Sessions & Agents
Read more about ADK Runtime: https://google.github.io/adk-docs/runtime/
async def initialise_validation_agent(self):
    session_service = InMemorySessionService()
    app_name = "ValidationAgent"
    self.validation_agent_session = await session_service.create_session(
        app_name=app_name,
        user_id="test_user"
    )
    self.validation_agent_runner = Runner(
        agent=validation_agent,
        session_service=session_service,
        app_name=app_name
    )
    logger.info("Validation agent initialized")
    return
https://github.com/yoyu777/adk-game-demo-public/blob/main/adk_runners.py
async def validate_input(self, user_input="elephant"):
    try:
        async for event in self.validation_agent_runner.run_async(
            user_id="test_user",
            session_id=self.validation_agent_session.id,
            new_message=types.Content(role='user', parts=[types.Part(text=user_input)])
        ):
            if event.is_final_response():
                logger.debug(event.content.parts[0].text)
                return json.loads(event.content.parts[0].text)
    except Exception as e:
        logger.error(f"Error during validation: {e}")
        return None
https://github.com/yoyu777/adk-game-demo-public/blob/main/adk_runners.py
05
the multi-agent
pattern
the guessing agents
- LLM agent: guess agent (makes a guess each new round)
- LLM agent: question agent (asks the next question if the guess is not confident)
- custom agent: orchestrator (coordinates the two and produces the answer)
class GuessOutput(BaseModel):
    guess: str = Field(..., description="The final guess for what the user is thinking of")
    confidence: int = Field(..., description="Confidence level (1-10) in this guess")
    reasoning: str = Field(..., description="Explanation of why this is the best guess. Summarise in less than 20 words.")

# Guessing Agent - Responsible for making final guesses
guessing_agent = Agent(
    name="guessing_agent",
    model="gemini-2.5-flash",
    instruction="You are an expert at making educated guesses in 20 Questions game",
    description="""You analyze all the information gathered from previous questions and answers
    to make the best possible guess about what the user is thinking of.""",
    output_schema=GuessOutput,
    output_key="guess_output"  # this is how you pass data between agents
)
https://github.com/yoyu777/adk-game-demo-public/blob/main/question_agents/agent.py
class QuestionOutput(BaseModel):
    question: str = Field(..., description="A strategic yes/no question to ask")
    reasoning: str = Field(..., description="Explanation of why you ask this question. Summarise in less than 20 words.")

# Asking Agent - Responsible for generating strategic questions
asking_agent = Agent(
    name="asking_agent",
    model="gemini-2.5-flash",
    instruction="You are an expert at asking strategic yes/no questions in 20 Questions game",
    description="""You specialize in asking the most effective yes/no questions to narrow down possibilities.
    Your goal is to eliminate as many possibilities as possible with each question.
    Consider categories like:
    …""",
    output_schema=QuestionOutput,
    output_key="question_output"
)
https://github.com/yoyu777/adk-game-demo-public/blob/main/question_agents/agent.py
06
Custom agent
for orchestration
# extending the BaseAgent class
class RootAgent(BaseAgent):
    guessing_agent: Agent
    asking_agent: Agent

    def __init__(self, name: str, guessing_agent: Agent, asking_agent: Agent):
        super().__init__(
            name=name,
            guessing_agent=guessing_agent,
            asking_agent=asking_agent,
            sub_agents=[guessing_agent, asking_agent]
        )

    # overriding the run implementation: custom logic generating a series of events
    async def _run_async_impl(
        self, ctx: InvocationContext
    ) -> AsyncGenerator[Event, None]:
        # (custom logic)
        yield event
https://github.com/yoyu777/adk-game-demo-public/blob/main/question_agents/agent.py
# calling the guessing agent
async for event in self.guessing_agent.run_async(ctx):
    yield event

# accessing session state
guess_output = ctx.session.state.get("guess_output", None)
confidence = guess_output.get("confidence") if guess_output else None

# deterministic logic
if confidence and confidence >= 9:
    logger.info("High confidence guess, proceeding to make guess")
    yield self.create_text_response_event(dumps({
        "action": "make_guess",
        "guess": guess_output.get("guess"),
        "reasoning": guess_output.get("reasoning")
    }), invocation_id=invocation_id)
    return

# calling the asking agent
async for event in self.asking_agent.run_async(ctx):
    yield event
https://github.com/yoyu777/adk-game-demo-public/blob/main/question_agents/agent.py
07
Summary
● ADK helps you build applications
● important concepts: Runners, Sessions, Agents
● simple agent, multi-agent & custom agent
we didn’t cover tools, workflow agents, using
custom models, which are also interesting.
Here is an excellent learning resource:
https://codelabs.developers.google.com/onramp/instructions
https://github.com/yoyu777/adk-game-demo-public
Thank you
Workshop/Data/ML/Cloud/Mobile -
Alper Sari, Javid Aliyev
Deploying
resources via
Gemini CLI
gcloud
IAC REST
Resource Creation
IAP
cloud-shell
gemini-cli (available in Cloud Shell)
gcloud mcp
cloud-run mcp
Prerequisites
Part - 1
Claim the GCP Credit
Link with billing account and project
Install Gemini CLI’s gcloud MCP
Visit Prompt Generator
Part - 2
Create a VPC Network
Create a Subnet
Create a VM Instance and Install
NGINX
Check the server
Overview
Credit Link : trygcp.dev/claim/devfest-baku
Repo : github.com/alper-sari/geminicli-cloud-shell-tutorial
Links
Thanks a
lot!
Feedback Form
WebRTC
Javid Aliyev - @thinkingIT
Software Engineer
Catch the Bug Before It Blinks
Pro Testing for Front-End Devs
About Me
Global companies I have worked at:
Processica (Branch of AWS)
Cymulate (Israel)
1. Why we need testing
Benefits of front-end testing
@GDG
Identifies bugs
Ensures Consistency
Cross-browser/device compatibility
Faster Development cycle
Scope for third-party integration
Example
EXAMPLE
@GDG
@GDG
Differences between backend and frontend testing
Backend Testing
Focuses on functionality of
the server and database
Ensures performant APIs
Does not require a browser
Frontend Testing
Focuses on interaction
between user and software
Does not require a database
May require a browser
@GDG
@GDG
Unit testing:
Jest
Vitest
Mocha
Tools for each test phase
Performance Testing
Lighthouse (Chrome DevTools)
WebPageTest
K6
End-to-End (E2E) Testing
Playwright
Cypress
Selenium
WebdriverIO
Integration Testing
Vitest (component + store)
Jest
React Testing Library
Cross-Browser Testing
Playwright (Chromium, Firefox,
WebKit)
BrowserStack
Sauce Labs
Accessibility Testing
axe-core (industry standard)
jest-axe
Lighthouse Accessibility Audit
Pa11y
@GDG
Visual Regression Testing:
Playwright Snapshots
Chromatic (for Storybook)
Percy
Applitools Eyes
Loki
Tools for each test phase
Acceptance Testing
Cypress (business-flow
validation)
Playwright
Testim / QA Wolf (automated
acceptance frameworks)
@GDG
2. Testing tips
LinkedIn @javid aliyev
Telegram @alyevv
Github @cavid-aliyev
Questions?
DevFest Baku 2025 Speaker Presentations (Software, AI, Workshop tracks)
  • 26.
    A process is a file too
    A process is an instance of a running program. Every time a program is launched or a command is executed, a new process with a unique ID (PID) is created. procfs stands for Process File System, typically mounted at /proc. It is a pseudo-filesystem (not stored on disk) that provides a view into the kernel’s internal data structures: its files and directories exist only in memory and are generated dynamically by the kernel.
  • 27.
    A process is a file too
    Each process has its own directory /proc/<PID>/ containing details such as:
    /proc/<PID>/cmdline: command-line arguments
    /proc/<PID>/cwd: symbolic link to the current working directory
    /proc/<PID>/exe: symbolic link to the executable
    /proc/<PID>/fd/: open file descriptors
    /proc/<PID>/status: general process info (UIDs, memory, state, etc.)
    /proc/<PID>/stat and /proc/<PID>/statm: numeric statistics (CPU time, memory usage, etc.)
    /proc/<PID>/environ: environment variables
  • 28.
    Let’s see the files of this process!
  • 29.
    Abstraction for communication
    A pseudo-terminal, often abbreviated as PTY, is a virtual device in Unix-like operating systems, including Linux. It functions as a pair of virtual character devices that establish a bidirectional communication channel between processes (e.g. /dev/pts/1). The master device (ptm or ptmx) is controlled by a terminal emulator process or remote login server; anything written to the master is given to the slave as input. The slave device (pts) acts exactly like a traditional hardware terminal (tty) device to the programs running within it; programs write their output to the slave, which is then read by the master process. The SSH daemon process on the remote server manages the connection: it forwards the client keystrokes received over the network to the pty master and reads output from the master to send it back to the client. The shell that starts on the remote server and runs the client’s commands receives input from the pty slave, which is being fed by the SSH daemon.
  • 30.
    How to make two computers talk? (Image source: iStock)
  • 31.
    Internet as a file (Server ↔ Client)
  • 32.
    Being blocked
    Read on a descriptor blocks if there’s no data available: fd = open(…) then read(fd). Did somebody press a key? Did somebody move the mouse? Did somebody send a message over the network? The same is true for write.
  • 33.
    Being blocked
    Disk files are an exception, since writes to disk happen via the kernel buffer cache, not directly: fd = open(…) then write(fd,…) goes to the buffer cache. The only time when writes to disk happen synchronously is when the O_SYNC flag was specified when opening the disk file.
  • 34.
    Non-blocking file descriptors
    A descriptor can be put in the nonblocking mode by setting the O_NONBLOCK flag. In this case, a call on that descriptor will return immediately, even if that request can’t be immediately completed. The return value can be either of the following: an error, when the operation cannot be completed at all; a partial count, when the input or output operation can be partially completed; the entire result, when the I/O operation could be fully completed.
  • 36.
    Running more than we can
    Context switching is the process where the CPU saves the state of a running task and loads the state of another task to switch from one to another, enabling multitasking by giving the illusion that multiple processes are running simultaneously.
  • 37.
    Concurrency
    Concurrency refers to the ability of a system to execute multiple tasks through simultaneous execution or time-sharing (context switching). Motivations for concurrent applications: improved performance by executing multiple tasks in parallel; better resource utilization, e.g., CPU and I/O devices are kept busy; scalability, i.e. handling more clients, requests, or tasks simultaneously; modularity, since concurrent tasks can be designed as independent components; fault isolation, since failures in one task may not crash the entire system.
  • 38.
    Multi-threaded vs Multi-process
    Threads and the main process use the same memory space; each process uses its own memory space.
  • 39.
    Combined model (Image credit: M. van Steen and A.S. Tanenbaum, Distributed Systems, 4th ed., distributed-systems.net, 2023)
  • 40.
    Implementation options
    Mx1: the kernel is not aware of any of the user threads; there is only one thread/process in the kernel, serving the user-space scheduler. 1x1: each user thread is mapped to one kernel thread. MxN: a combination of Mx1 and 1x1; a kernel thread may serve an individual user thread or the scheduler. (Image credit: medium post)
  • 41.
    My first discovery of Concurrency
    Case 1: Advanced Payment System. Case 2: Billing System for Azeronline. “Many years later, as he faced the firing squad, Colonel Aureliano Buendía was to remember that distant afternoon when his father took him to discover ice” (Gabriel García Márquez, One Hundred Years of Solitude)
  • 42.
    Advanced Payment System: search a number (e.g. 050123456) → Name & Surname → make payments → operation result
  • 43.
    Advanced Payment System
    “Your number please” — two customers answer “111222” at once. “Are you John Smith?” “Sorry, I am Bob.” “Are you Bob Adams?” “Sorry, I am John.”
  • 44.
    Concurrency problems: Competition vs Cooperation; Race Condition (multiple tasks updating a shared resource)
  • 45.
    Concurrency solutions: Mutex, Semaphore, Monitor (notify), Non-blocking
  • 46.
    Concurrency Demo: one processor involved. Calculations in loops 1-8 run one after another, then the total is calculated.
  • 47.
    Concurrency Demo: multiple processors involved. Loops 1-8 execute in parallel, then the total is calculated.
  • 48.
    A visual demo: Image Quantization
  • 49.
    A practical application: handling client requests (but the server can serve only 6 of them at a time!)
  • 50.
    A concurrency example: Apache Web Server
    Control Process (Parent) → Child Processes → Listener Threads and Server Threads → Web Clients
  • 51.
    Problems
    Although threads are generally lighter than processes, a large number of concurrent threads can still consume significant memory and CPU resources, potentially leading to performance degradation under heavy load or with a high number of idle connections. The thread-per-connection model can encounter scalability limitations, especially when handling a massive number of concurrent, long-lived connections. Creating, managing, and destroying a large number of threads can introduce overhead, especially under fluctuating or bursty workloads, potentially impacting overall performance.
  • 52.
    C10k problem
    Stated by software engineer Dan Kegel in 1999. Stands for 10,000 concurrent requests: the C10k problem was the challenge of designing network servers that could handle 10,000 concurrent client connections.
  • 53.
    Nginx story
    2002: Russian software engineer Igor Sysoev began developing Nginx to solve the “C10K problem”, handling 10,000 simultaneous connections efficiently. 2004: Nginx was publicly released as open-source software under a BSD-like license. 2006-2010: it gained popularity for its event-driven, asynchronous architecture, outperforming Apache in serving static content and handling high concurrency. 2017: surpassed Apache in market share for the top 1000 busiest websites, marking a significant shift in web infrastructure.
  • 54.
    Clients → Master → Workers → CPU
    The master is responsible for reading configuration, creating sockets, managing signals, and spawning workers; it does not handle client I/O. Each worker runs an independent event loop, accepts client connections and handles requests. Each worker is single-threaded, non-blocking, and uses I/O multiplexing for thousands of concurrent connections.
  • 55.
    Resources are naturally shared (in the same codebase) among the threads and the process. Between processes there is no common memory: none of the resources of one process is visible outside. How to share the user requests, sockets and other resources?
  • 56.
    A socket is listening for the requests. The descriptor of this socket (a file) is shared among processes via the OS kernel; processes use the OS to check whether there is anything in the socket.
  • 57.
    I/O Multiplexing
    We showed how a process handles I/O on a single descriptor. Often, a process might want to handle I/O on more than one descriptor. I/O multiplexing is a technique that allows a single thread to monitor and manage multiple I/O channels, such as network sockets, to handle many connections efficiently.
  • 58.
    Problem: requesting multiple sources in an asynchronous mode. Blocking is not a solution for sure: we cannot keep waiting! The non-blocking approach might not be the right approach either: when data is coming in very slowly the program will wake up frequently and unnecessarily, which wastes CPU resources; when data does come in, the program may not read it immediately if it’s sleeping, so the latency of the program will be poor; and handling a large number of file descriptors with this pattern would become cumbersome. Solution: I/O multiplexing modes need to be considered.
  • 59.
    I/O Multiplexing
    There are several ways of multiplexing I/O on descriptors: non-blocking I/O (the descriptor itself is marked as non-blocking, operations may finish partially); signal driven I/O (the process owning the descriptor is notified when the I/O state of the descriptor changes); polling I/O (with select or poll system calls, both of which provide level-triggered notifications about the readiness of descriptors).
  • 60.
    I/O Multiplexing: non-blocking I/O
    All file descriptors are set to non-blocking mode. If a process tries to perform I/O operations very frequently, the process has to continuously retry operations that returned an error to check if any descriptors are ready. Such busy-waiting in a tight loop could lead to burning CPU cycles.
  • 61.
    I/O Multiplexing: signal driven I/O
    The kernel is instructed to send the process a signal when I/O can be performed on any of the descriptors: the kernel tracks a list of descriptors and sends the process a signal every time any of the descriptors become ready for I/O. Signals are expensive to catch, rendering signal driven I/O impractical for cases where a large amount of I/O is performed.
  • 62.
    I/O Multiplexing: polling I/O
    The process uses the level-triggered mechanism to ask the kernel, via a system call, which descriptors are capable of performing I/O. There are a few I/O multiplexing system calls: select (defined by POSIX), epoll on Linux, and kqueue on BSD. These all work fundamentally the same way: they let the kernel know what events (typically read events and write events) are of interest on a set of file descriptors, and then they block until something of interest happens.
  • 63.
    epoll
    epoll is a Linux kernel system call for an I/O event notification mechanism. epoll monitors multiple file descriptors to see whether I/O is possible on any of them. epoll uses a red-black tree (RB-tree) data structure to keep track of all file descriptors that are currently being monitored.
  • 64.
    Edge-Triggered Polling (ET)
    Events are delivered only when the state transitions (from not-ready → ready). Example: you get a read event only when new data arrives, not while data remains unconsumed. Requires reading/writing until EAGAIN to avoid missing future events. More efficient because it avoids repeated notifications, but more complex to implement correctly: mismanagement may lead to stalled I/O.
  • 65.
    Level-Triggered Polling (LT)
    The default behavior in many polling APIs. A file descriptor is reported as “ready” as long as the condition remains true (e.g., there is unread data in the buffer). The application may receive repeated readiness notifications until the condition is cleared. Safer and simpler but may cause unnecessary repeated wake-ups.
  • 66.
    Nginx - Full Scenario
    The master reads nginx.conf, creates listening sockets, binds to ports (e.g., 80, 443), and sets them as non-blocking. Then it forks the worker processes. Each worker inherits the listening socket file descriptors from the master. During its startup routine each worker initializes its event loop subsystem, calls epoll_create(), and stores the returned epoll file descriptor for future event registration.
  • 67.
    Context Switch and Pinning
    Context switching in multiprocessing: the CPU rapidly switches between processes, pausing one and resuming another so they can “share” the same core. Each switch has a cost: the CPU must save the current state, load another one, and refill caches, which can slow systems down under heavy load. Nginx ties each worker process to a specific CPU core, avoiding unnecessary movement between cores. This improves performance and lowers latency, because the worker keeps its core’s warm caches and avoids extra context-switch overhead.
  • 68.
    Customer Service Analogy - Apache: only 4 computers available, but 25 people hired to handle customer requests
  • 69.
    Customer Service Analogy - NGINX: only 4 computers available and only 4 employees, each with a dedicated PC. They pick the next task when they have time (while a previous task is waiting on something). Example: one customer may have 10-second work, while another may require an approval from the head office or credit system.
  • 70.
    Another asynchronous event loop model: Node.js
  • 71.
Concurrency in Functional Programming languages: Immutability simplifies concurrency – no shared mutable state, so fewer race conditions than in imperative code. Pure functions parallelize naturally – no side effects, so tasks run safely on different cores without coordination. Actor-based concurrency – many FP languages use message passing (Erlang, Elixir, Scala/Akka) instead of shared memory. Higher-level concurrency primitives – futures, promises, parallel map, and lazy streams reduce the need for locks.
  • 72.
Recommendations: 1. Learn Linux – it is a monolithic OS; check out the alternatives. What will the OSs in quantum computers be? 2. Learn concurrency programming. 3. Learn low-level programming – you cannot get ready for quantum computing if you don't understand hardware well.
  • 73.
  • 74.
References: “What is epoll?” – a Medium post; “Non-blocking I/O” – a Medium post; “Blocking I/O, Nonblocking I/O, And Epoll” (link)
  • 75.
Hey all, I’m Amulya Bhatia! Chief Architect, Gen AI SME, GDE. They/Them. My ADK Book · LinkedIn
  • 76.
Firebase Studio is an integrated and extensible agentic workspace to build, run, and manage web apps, cross-platform mobile apps, backend services, and more.
  • 77.
Web-based – your workspace is a URL, installable as a PWA. Full cloud VM – supports most toolchains. Deep Google Integrations – streamline app dev workflows. Preconfigured Environments – start from common starting points or customize your own. Live preview – real device, emulator, or hosted IFRAME. Designed for collaboration – workspace sharing available, with more features planned. AI Assistance – across code, test, debugging, etc. Built on VS Code – world-class code editing.
  • 78.
  • 79.
  • 80.
  • 81.
  • 82.
  • 84.
Agentic Experiences Throughout: Our advanced coding capabilities are enhanced with agentic experiences that take complex (or boring) actions on your behalf. Whether the changes need to happen across a section of code, a single file, or an entire code base, Gemini will understand your intent and accomplish the task.
  • 85.
AI-centric View vs Code-centric View: We want to ensure you all have a choice in using as much or as little AI as you want when building your apps.
  • 86.
Share and Collaborate in Real-Time: Not only can you share the deployed link, you can share the entire workspace with a URL. This means you can collaborate in real-time within the same Firebase Studio environment, and then push updates instantly.
  • 87.
  • 88.
App Prototyping Agent: To build our full-stack web application, we can start with a natural-language prompt and it will create a PRD for us to review. After we modify the proposal as needed, we can generate the app and iterate in chat to update and rebuild the application.
  • 89.
We can prototype quickly to get a Next.js app with Genkit for agentic features that is connected to Firebase!
  • 90.
Quickly Deploy to Firebase: We can quickly deploy the application we created to Firebase App Hosting. The wizard will guide you through picking the correct project and billing account, and kick off the new or updated deployment.
  • 91.
Firebase Studio supports existing codebases and any stack that can be installed with Nix (120k+ packages)
  • 92.
Exploring the Code: Every workspace is backed by a full IDE in which we can edit the generated code. There is a full VM backing Firebase Studio, so you can run commands that would usually fail in the browser.
  • 93.
Editing the code with Gemini: In the IDE Code view of Firebase Studio we still have the full power of Gemini in the workspace. Gemini can read and write files and run terminal commands with the full context of the project, recently opened files, and any attachments sent.
  • 94.
Proprietary + Confidential – Gemini CLI: a lightweight, powerful, and accessible CLI tool that integrates cutting-edge AI directly into the terminal.
  • 95.
Gemini CLI: Code and Files – generate code and manage files (the core CLI function). Invoke tools – intelligently invoke other developer tools, with MCP support, to manage local development, run tests, interact with cloud services, etc. Coordinate with other apps – coordinate with other applications like VS Code or Chrome, performing actions or gathering context. Comprehensive Context – including project files, data, and potentially even screen sharing, to provide the most relevant and effective assistance.
  • 96.
Gemini CLI Architecture: User via Terminal → packages/cli → packages/core → Tools / MCP Servers and the Gemini/Code Assist API. The user sends tasks/prompts via the terminal and receives outputs and requests (to perform actions via tools). packages/cli sends user requests/confirmations to the core and receives tool details and final outputs from the core. The core sends prompts/tool info to Gemini, and executes tools and receives results – from local tools (file, shell, web) and local and remote MCP servers.
  • 97.
Interactive and File/Shell Integration: / commands, @ context, ! shell
  • 98.
Powerful Built-In Tools: file-system tools (read_file, write_file, list_directory, search_file_content); shell tool (run_shell_command); web tools (web_fetch, google_web_search)
  • 99.
Hierarchical Context & Memory: project-specific and global instructions to the AI using GEMINI.md files
  • 100.
Secure Sandboxing: run in an isolated environment using technologies like Docker, Podman, or macOS's native Seatbelt functionality
  • 101.
Custom Commands and MCP: create custom slash commands for frequently used prompts; Model Context Protocol (MCP) servers
  • 103.
Gen AI Toolbox benefits – better manageability, observability, and security for your gen AI agents: Simplified development – reduced boilerplate code to simplify tool development. Better performance – efficient connection pooling and optimized connectors for databases. Zero downtime – a configuration-driven approach enables deployment without interruption. Enhanced security – provides simple patterns for integrating user authentication. End-to-end observability – integrated with Google Cloud monitoring and tracing.
  • 104.
Toolbox flow: 1. Access database – specify the URI. 2. Define tools – load tools. 3. Invoke tool.
  • 105.
tools.yaml – Configuration:
● Users define several resources in a file with 3 sections: Sources, Tools, Toolsets
● Each is defined in a map of name → object definition
● Toolbox loads them on startup and builds the appropriate APIs
● A tool (e.g. the user-defined postgres-sql tool) interacts with one source (e.g. Cloud SQL, AlloyDB, Postgres)
  • 106.
Define sources – Sources represent a source of data that a tool can use. Typically they encapsulate _how_ a database is connected to – IP address, credentials, etc.

sources:
  # This tool kind has some requirements. See
  # https://github.com/googleapis/genai-toolbox/blob/main/docs/sources/cloud-sql-pg.md#requirements
  my-cloud-sql-source:
    kind: cloud-sql-postgres
    project: my-project-name
    region: us-central1
    instance: my-instance-name
    user: my-user
    password: my-password
    database: my_db
  • 107.
Define Tools – Tools are an action, typically executed on a source. These include a description of how/when to take the action, and things like which parameters are specified.

tools:
  get_flight_by_id:
    kind: postgres-sql
    source: my-cloud-sql-source
    description: >
      Use this tool to list all airports matching search criteria.
      Takes at least one of country, city, name, or all and returns
      all matching airports. The agent can decide to return the
      results directly to the user.
    statement: "SELECT * FROM flights WHERE id = $1"
    parameters:
      - name: id
        type: int
        description: "'id' represents the unique ID for each flight."
  • 108.
Define Toolsets – Toolsets are logical groups of tools. You can load all tools, or load tools by toolset, to pass into an agent.

toolsets:
  my_first_toolset:
    - my_first_tool
    - my_second_tool
  my_second_toolset:
    - my_second_tool
    - my_third_tool

# This will load all tools
all_tools = await client.load_toolset()

# This will only load the tools listed in 'my_second_toolset'
my_second_toolset = await client.load_toolset("my_second_toolset")
  • 109.
  • 112.
AI TRACK – Mirakram Aghalarov & Zahra Bayramli, Nishi Ajmera, Kamran Huseynov & Tarlan Huseynov, Nikhilesh Tayal, Yoyu Li
  • 113.
WHEN LLM STAYS HOME: BUILDING KNOWLEDGE SYSTEMS ON PREMISES – Mirakram Aghalarov, Zahra Bayramli
  • 114.
</SPEAKERS: Mirakram Aghalarov – Senior Deep Learning Engineer @SOCAR CIC; Lecturer of AI&ML courses at BHOS; MSc Data Science and Engineering graduate from Politecnico di Torino. Zahra Bayramli – Deep Learning Engineer @SOCAR CIC; BSc Computer Science graduate from Korea Advanced Institute of Science & Technology (KAIST).
  • 115.
//Table of contents: {01} Cloud Infrastructure {02} On-Premise Infrastructure {03} Hybrid Solution
  • 116.
  • 117.
</AI Startups Worldwide: ~92,000 in 2024 → ~212,000 in 2025
  • 118.
Delivery Hero SE is a German multinational online food ordering and food delivery company based in Berlin, Germany. REI is a member-owned retailer that sells outdoor gear, promotes sustainable adventure, and does customer-specific design using GenAI. </Distribution over Cloud
  • 119.
The AI developer platform to build AI agents, applications, and models with confidence. Accelerating AI development and deployment with a secure collaboration platform for AI developers and data providers. </Distribution over Cloud
  • 120.
Deepset is an enterprise software vendor that provides developers with the tools to build production-ready artificial intelligence and natural language processing systems. Robin AI is a legal-tech company that provides an AI-powered platform to review, analyze, and manage contracts far more quickly and securely. </Distribution over Cloud
  • 121.
</How to build a simple Chatbot: Vertex AI Studio, Vertex AI Search, Agent Garden, Ray
  • 122.
  • 123.
</How to build a simple Chatbot: GCP provides an easy-to-use pipeline builder to establish the flow. To build a simple RAG agent, Vertex AI Studio helps integrate the chatbot with Vertex AI Search capabilities on top of the selected vector database.
  • 124.
</How to build a simple Chatbot (Azure): AI Document Intelligence, AI Content Understanding, OpenAI, Model Registry, Azure Machine Learning
  • 125.
  • 126.
</How to build a simple Chatbot (AWS): Amazon SageMaker, Amazon Bedrock, AWS Lambda, AWS S3, Amazon Kendra, Amazon API Gateway
  • 127.
  • 128.
  • 129.
  • 130.
  • 131.
  • 132.
</Agentic AI – User: “My dear, tell me about the weather in our 4th plant.” Agent: “What a marvellous question! I am going to check the location of ‘our 4th plant’…” (SQL call to the database) “Okay! Now the location of plant with ID 4 is known. Now let's look at web services for the weather information.” (Searching through the web) “The answer is 26 degrees Celsius.”
  • 133.
  • 134.
  • 135.
</Cloud Evaluation – Pros: very low downtime, high scalability, mature solutions, low-code environment, faster deployment, no CAPEX. Cons: higher OPEX, network bandwidth limitations, uncontrollable cost, not suitable for real time, limited privacy.
  • 136.
  • 137.
</Azerbaijan Situation – Limited Privacy: Azerbaijani legislation does not allow private information to be kept in cloud infrastructure because of where that infrastructure is located. Cyber threats and privacy-assurance requirements mean no sensitive information may be sent outside Azerbaijan's borders.
  • 138.
</Azerbaijan Situation – Limited privacy leads to: higher time to MVP, higher requirements to accomplish the task, and a smaller number of startups.
  • 139.
  • 140.
</Need for On-prem: Data Privacy and Compliance – sensitive documents, internal systems. Restricted Environments – factories, labs, government facilities with no external internet access. Cost – cloud APIs become expensive with higher usage. Full Customization and Control – ability to retrain, fine-tune, or modify models without vendor limits.
  • 141.
</RAG Architecture Example – Document Ingestion: Optical Character Recognition, text/table extraction, chunking → Embedding Generation: an embedding model produces vector representations (e.g. [0.2, 0.7, 0.5, …]) → Vector Database: stores the embeddings and allows fast similarity search.
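The chunking step can be sketched with a naive fixed-size splitter with overlap (sizes are illustrative; production pipelines usually split on sentence or token boundaries instead of characters):

```python
# Naive character-based chunker with overlap for the ingestion pipeline.
def chunk(text, size=20, overlap=5):
    chunks = []
    step = size - overlap
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break
        start += step
    return chunks

# A 50-character synthetic "document" so the overlap is easy to verify.
doc = "".join(chr(97 + i % 26) for i in range(50))
parts = chunk(doc)
print(len(parts))                     # 3 chunks for 50 chars
# Consecutive chunks share `overlap` characters:
assert parts[1][:5] == parts[0][-5:]
```

The overlap ensures a sentence straddling a chunk boundary is still fully contained in at least one chunk.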
  • 142.
</RAG Architecture Example – Retriever: vector search, hybrid search, reranker → LLM (on-prem): receives the prompt + retrieved chunks → Application Layer: UI/chat interface, FastAPI/tools → Output.
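The retriever's vector-search step is at heart a nearest-neighbour lookup over embeddings. A toy sketch with made-up 3-dimensional vectors and document texts (a real system would use one of the embedding models listed later and a vector database):

```python
import math

# Cosine similarity between two equal-length vectors.
def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Tiny in-memory "vector database": chunk text -> pretend embedding.
index = {
    "plant 4 is in Ganja":   [0.9, 0.1, 0.0],
    "Q3 target is 100K":     [0.1, 0.8, 0.2],
    "the meeting is at 4pm": [0.0, 0.2, 0.9],
}

# Pretend embedding of the query "where is plant 4?"
query_vec = [0.85, 0.15, 0.05]

# Retrieve the chunk whose embedding is closest to the query embedding.
best = max(index, key=lambda doc: cosine(query_vec, index[doc]))
print(best)  # the chunk about plant 4
```

The retrieved chunk is then concatenated with the user prompt and handed to the on-prem LLM.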
  • 143.
  • 144.
  • 145.
  • 146.
  • 147.
  • 148.
</Challenges with Open-Source OCR: Complex tables – merged cells, nested tables. Offline VLMs – large, slow, proprietary. Handwritten documents – cloud APIs (Azure AI Document Intelligence, Google Vision API) beat open source. Also: high VRAM requirements, no fine-tuned OCR pipelines, latency too high without GPU clusters.
  • 149.
  • 150.
  • 151.
</Core Components of On-Prem Retrieval: local vector database; embedding model (bge-m3, E5-large, Nomic-embed, Jina v2, MiniLM, Instructor-xl); reranker (bge-reranker-large, ColBERT).
  • 152.
</Vector Database Capabilities Missing: Hybrid Search – cloud versions tune this automatically; on-prem requires manual scoring & fusion. Premium Features – e.g. Reciprocal Rank Fusion (RRF) in Elasticsearch is not available on-premise and requires a custom implementation. Enterprise-Grade Monitoring & Analysis – cloud dashboards show slow queries, index health, drift detection, etc.
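Reciprocal Rank Fusion is small enough to reimplement by hand on-prem. A sketch fusing a keyword ranking with a vector ranking (k = 60 is the commonly used constant; the document names are made up):

```python
# Reciprocal Rank Fusion: each list contributes 1/(k + rank) per document,
# so documents ranked well in several lists float to the top of the fusion.
def rrf(rankings, k=60):
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

keyword_hits = ["doc_a", "doc_b", "doc_c"]   # e.g. a BM25 keyword ranking
vector_hits = ["doc_a", "doc_d", "doc_b"]    # e.g. an embedding ranking

fused = rrf([keyword_hits, vector_hits])
print(fused[0])  # doc_a: ranked first in both lists
```

This is the "manual scoring & fusion" the slide refers to: two independent rankings in, one hybrid ranking out.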
  • 153.
</Challenges with Open-Source Models – Sensitivity to Noisy Input: DeepSeek/Qwen degrade more on messy documents. Noisy input: “Mtng w/ team @ 4pm pls updte repprt Q3 target = 1O0K ??? ask Sam re: buget”. Llama: “Meeting with the team at 4pm. Update the report. Q3 target is 100K. Ask Sam about the budget.” Qwen: “The meeting is about 4pm and a report, possibly about budget. Not sure what target means.”
  • 154.
</Challenges with Open-Source Models: Sensitivity to noisy input – DeepSeek/Qwen degrade more on messy documents. Context size expansion – big documents ⇨ huge prompts ⇨ slower inference. Quality & alignment – more hallucination, lower-quality answers.
  • 155.
</Agentic AI On-Prem – Model Context Protocol (MCP): on-prem LLMs are often not aligned for tool calls. Architecture: an MCP host speaks the MCP protocol to MCP servers A, B, and C, which front local file storage, external APIs & apps, and a remote database.
  • 156.
</Agentic AI On-Prem – Unreliable Function Calling: no built-in tool-calling alignment like ChatGPT or Gemini. ReAct agents do not work – they lose state across steps ⇨ repeat wrong calls, produce inconsistent “thought/action” formatting, and hallucinate tool names or steps.
  • 157.
</Agentic AI On-Prem – Knowledge Graph Integration: entity extraction over the database produces entities and relationships stored in a graph DB; given a user query, the agent (via the LLM) generates a graph query, runs it against the graph DB, and returns the result as output.
  • 158.
</Agentic AI On-Prem – Knowledge Graph Integration: the LLM extracts structured entities and relations using long rule-based prompts, with recursive multi-pass extraction across chunks.
  • 159.
</GPU infra: Inference & Deployment – TensorRT-LLM: provides 2–4x faster throughput compared to raw PyTorch; provides better parallelism; supports FP8/INT4 quantization to reduce memory usage; requires custom engine building per GPU model.
  • 160.
</GPU infra: Inference & Deployment – vLLM / TGI: high-throughput distributed serving; good batching & streaming performance; supports HF models & the OpenAI API standard; requires a full GPU stack. Ollama / llama.cpp: easy setup, simple single-node runtime; supports GGUF quantized models with a small memory footprint; limited scaling & lower token throughput.
  • 161.
</On-Premise Evaluation – Pros: high privacy, faster communication, real-time streaming, full control over cost, more customization. Cons: limited scalability, high CAPEX, high downtime, higher development time.
  • 162.
</Best of 2 Worlds: high privacy, faster communication, real-time streaming, full control over cost, and more customization (on-prem) combined with very low downtime, high scalability, mature solutions, a low-code environment, and faster deployment (cloud).
  • 163.
Hybrid is a Possible Choice: video, text, and data streams (e.g. “Clean my code! Now!”) pass through blurring, masking, and anonymization before leaving the premises.
  • 164.
Hybrid is a Possible Choice: after blurring/masking/anonymization, the request goes to the cloud LLM (with Tools 1–3), and the response is demasked.
  • 165.
Hybrid is a Possible Choice: after demasking, the LLM answer is returned to the user.
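A sketch of the mask → cloud LLM → demask flow above. The single e-mail regex and the fake cloud-model string are illustrative assumptions, not a production anonymizer:

```python
import re

# Replace e-mail addresses with placeholder tokens before leaving the premises,
# remembering the mapping so the cloud model's answer can be demasked locally.
def mask(text):
    mapping = {}

    def repl(m):
        token = f"<PII_{len(mapping)}>"
        mapping[token] = m.group(0)
        return token

    masked = re.sub(r"[\w.]+@[\w.]+", repl, text)  # mask e-mail addresses only
    return masked, mapping

def demask(text, mapping):
    for token, original in mapping.items():
        text = text.replace(token, original)
    return text

masked, mapping = mask("Contact sam@example.com about plant 4")
# Pretend cloud LLM output that echoes the placeholder token back:
cloud_answer = f"I e-mailed {masked.split()[1]} as requested"
final = demask(cloud_answer, mapping)
print(final)
```

Real deployments would cover names, IDs, and addresses too (e.g. with an NER model), but the token round-trip is the same.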
  • 166.
//Conclusion: It is not easy to keep pace with development in Azerbaijan compared to the worldwide scale. The only way out is to understand the reasons behind the limitations and find ways around them, eliminating the disadvantages of both sides.
  • 167.
//Thanks – do you have any questions? CREDITS: This presentation template was created by Slidesgo, and includes icons by Flaticon and infographics & images by Freepik.
  • 168.
Building Conversational Agents Using MCP, A2A and ADK
  • 169.
  • 170.
What is an AI Agent? A software entity designed to act autonomously to achieve specific goals. It performs tasks, interacts with users, and utilizes external tools. Agents go beyond simple input/output – they can reason, plan, and orchestrate.
  • 171.
What are Conversational Agents? AI systems that use natural language to interact with users and complete tasks – e.g. a coding assistant, a voice assistant (Siri, Alexa), or a support bot.
  • 172.
Agentic Architectures – from simple tasks to complex workflows. Modularity: break down complex problems into smaller, manageable agent tasks. Specialization: create expert agents for specific functions (e.g., a “billing agent,” a “research agent”). Collaboration: agents can work together, delegating tasks and sharing information. Scalability & Maintainability: easier to update, debug, and scale individual components.
  • 173.
Agent Development Kit (ADK): a flexible and modular framework for developing and deploying AI agents.
  • 174.
Agent Development Kit – Key Goals: make agent development feel like software development; simplify creation, deployment, and orchestration. Core Principles: model-agnostic (optimized for Gemini, but supports others via LiteLLM); deployment-agnostic (local, Cloud Run, Agent Engine); compatible with other frameworks (e.g., LangChain, CrewAI).
  • 175.
Model Context Protocol: MCP standardises the way AI models and tools communicate and share context.
  • 176.
  • 177.
  • 178.
Agent to Agent Protocol: enables seamless communication and coordination between multiple AI agents.
  • 179.
  • 180.
  • 185.
  • 186.
MCP vs A2A:
Aspect        | MCP (Model Context Protocol)     | A2A (Agent2Agent Protocol)
Purpose       | Agent ↔ Tools/Resources          | Agent ↔ Agent Collaboration
Communication | Client-Server (function-like)    | Peer-to-Peer (conversational)
State         | Stateless (tools as functions)   | Stateful (task lifecycle)
Best For      | Accessing tools, APIs, databases | Multi-agent coordination
  • 187.
Agent as a Tool? When to use agent-as-tool (MCP): single-orchestrator architecture; short-to-medium tasks (minutes to ~1 hour); need tight control over the workflow; simple request-response patterns; you need deterministic, structured interactions. When to use A2A: long-running tasks (hours to days); peer collaboration and negotiation; dynamic agent discovery; multi-vendor ecosystems.
  • 188.
MCP ensures agents have the right context and tools to operate efficiently, while A2A enables seamless collaboration. Together, they create a powerful, interoperable AI ecosystem.
  • 189.
  • 190.
  • 191.
Agentic Workflows with AWS – DevFest 2025. Tarlan Huseynov, DevOps Engineer, AWS Community; Kamran Huseynov, AI Engineer.
  • 199.
Semantic Search and Retrieval in a RAG Pipeline (embed → search → retrieve → generate): Phase 1 – chunking, embedding & storing. Phase 2 – semantic search: embedding inference + similarity search. Phase 3 – augmented generation & response.
  • 200.
  • 201.
Strands Agents Solution: Strands Agents – AWS-backed, production-ready agent workflows; model-driven, composable, and emphasizing simplicity. MCP – modular tool servers, scalable & observable. Self-managed provisioning for the logical stack (ECS + Fargate).
  • 202.
Building With Strands Agents – GitHub
  • 210.
  • 211.
  • 212.
AI Agents = LLMs + reasoning + external applications + self-reflection capabilities
  • 213.
Do you need an AI Agent? Give me 10 ideas for my Twitter post on AI. Prepare a report of top AI research papers. Send Sachin a leave request and update my calendar accordingly. Book the cheapest flight from Delhi to Dubai. Translate this paragraph from Hindi to English. Write an email requesting leave in a polite tone.
  • 214.
  • 215.
  • 216.
AI Agent with Google Search
  • 217.
How the model was thinking
  • 218.
  • 219.
AI Agents actually don’t have any memory
  • 220.
Memory has to be handled outside the LLM
  • 221.
AI Agent Memory: memory is important for conversations
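The point fits in a few lines: the model itself is stateless, so the application keeps a history list and replays it on every call. `fake_llm` is a stand-in for a real model call and just reports how many messages it saw:

```python
# Conversation memory kept OUTSIDE the (stateless) model: the application
# appends every turn to `history` and replays the whole list on each call.
history = []

def fake_llm(messages):
    # A real call would send `messages` to an LLM; this stub just proves
    # the model receives the full history, including earlier turns.
    return f"(model saw {len(messages)} messages)"

def chat(user_text):
    history.append({"role": "user", "content": user_text})
    reply = fake_llm(history)
    history.append({"role": "assistant", "content": reply})
    return reply

chat("My name is Leyla")
second = chat("What is my name?")
print(second)  # (model saw 3 messages)
```

Without the replay, the second call would arrive with a single message and the model could not possibly answer the name question.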
  • 222.
  • 223.
  • 224.
  • 225.
  • 226.
AI Agent with a conversation memory
  • 227.
  • 228.
  • 229.
  • 230.
  • 231.
  • 232.
  • 233.
AI Agent with self-managing memory
  • 234.
Tool for memory management
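A self-managing agent exposes memory operations as tools it can call itself. A minimal sketch, where the tool names (`save_memory`, `recall`) and the stored facts are illustrative assumptions:

```python
# Hypothetical memory tools an agent could be given: one to persist a fact
# it decides is worth remembering, one to retrieve facts by keyword.
long_term_memory = []

def save_memory(fact: str) -> str:
    """Tool: persist a fact the agent decides is worth remembering."""
    long_term_memory.append(fact)
    return f"saved: {fact}"

def recall(keyword: str) -> list:
    """Tool: retrieve previously saved facts matching a keyword."""
    return [f for f in long_term_memory if keyword.lower() in f.lower()]

# In a real agent these calls would be triggered by the model's tool calls.
save_memory("User's name is Leyla")
save_memory("User prefers metric units")
print(recall("name"))
```

The difference from plain conversation memory is that the model, not the application, decides what gets written and read.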
  • 235.
  • 236.
  • 237.
There are different kinds of “Memory”
  • 238.
AI Agent Evaluation: how to evaluate the chaos?
  • 239.
Can you identify what went wrong here?
  • 241.
What could go wrong with AI Agents
  • 242.
  • 243.
  • 244.
  • 245.
Suite of LLM Evaluation Methods
  • 246.
  • 247.
  • 248.
Evaluation Techniques: code-based evals; LLM as a judge; human annotations
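Code-based evals, the cheapest of the three techniques, are just deterministic checks on an agent's output. A sketch where the captured `agent_output` and the expected tool name are made up for illustration:

```python
# Deterministic (code-based) checks on a single captured agent step.
# The output record and the checks are illustrative assumptions.
agent_output = {
    "tool": "get_weather",
    "args": {"city": "Baku"},
    "answer": "26 degrees Celsius",
}

def eval_tool_choice(output, expected_tool):
    # Did the agent pick the tool the test case expects?
    return output["tool"] == expected_tool

def eval_answer_format(output):
    # Does the final answer satisfy a simple format requirement?
    return "degrees" in output["answer"].lower()

results = {
    "tool_choice": eval_tool_choice(agent_output, "get_weather"),
    "answer_format": eval_answer_format(agent_output),
}
print(results)
```

Anything these checks cannot express (tone, factuality, reasoning quality) is where LLM-as-a-judge or human annotation takes over.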
  • 249.
  • 250.
  • 251.
  • 252.
  • 253.
LLM as a judge
  • 254.
LLM as a judge – important considerations
  • 255.
  • 256.
Elements to evaluate: tool choice; generation; path choice
  • 257.
If your Agent’s output is correct, does the trajectory matter?
  • 259.
  • 260.
Google ADK – available as a Python SDK
  • 261.
  • 262.
  • 263.
Google ADK – Agent Development Kit
  • 264.
  • 265.
Happy to connect on LinkedIn. Founder at “AI ML etc.”; AI Instructor at LinkedIn Learning; Google Developer Expert for AI; IIT Kharagpur alumnus.
  • 266.
  • 267.
Beyond the Prompt: An Anatomy of an AI-Powered Game Using Agent Development Kit – Yoyu Li, Creative Technology Director, Infinite Whys. Baku, 2025.
  • 268.
What we are going to talk about: ● what ADK is, and why (spoiler: Agent Development Kit) ● a game demo ● a simple agent ● the multi-agent pattern ● a custom agent
  • 269.
01 – what is ADK*? and why? (*Agent Development Kit)
  • 270.
An Evolution of AI Agents: LLM + Prompt → LLM + Retrieval (RAG) → LLM + Retrieval + Tools → + Many Tools + Reasoning Loop (Agent) → Multi-Agent Systems
  • 271.
AI Agents reason, plan, and execute tasks for end users. Key components: Model(s) – used to reason over goals, determine the plan, and generate a response (an agent can use multiple models). Orchestration – the model-based reasoning/planning and task-execution loop; executes the steps of the LLM-derived plan to accomplish given tasks, including tool invocations and maintenance of intermediate state. Agent definitions – profile, goals, instructions, tools, … Tools – fetch data, perform actions or transactions by calling other APIs or services (functions, APIs, database queries). Memory – short-term and long-term.
  • 272.
Is there something more? ADK helps us build a good architecture for AI-powered applications. Having specialised agents means we can potentially use smaller/more efficient AI, or even no AI. And we should only use Gen AI when it makes sense to do so.
  • 273.
02 – Demo time: a simple game built in Python
  • 274.
(this is for demo only, therefore not a very sophisticated application)
  • 275.
03 – An anatomy: what is under the hood
  • 276.
File structure: main.py and ui_components.py (display layer); game_controller.py (control layer); adk_runners.py and test_runners.py (ADK runners & sessions); validation agent folder with agent.py (simple agent); question agents folder with agent.py (multi agents); .env file. Git repo: https://github.com/yoyu777/adk-game-demo-public
  • 277.
04 – a simple LLM agent to validate the user input
  • 278.
  • 279.
Create the agent using the ADK command line interface: > adk create [agent_name]
  • 280.
from pydantic import BaseModel, Field
from google.adk.agents import LlmAgent

class ValidationOutput(BaseModel):
    is_valid: bool = Field(..., description="Indicates if the input is a valid object for the game.")
    reason: str = Field(..., description="Explanation of why the input is valid or not.")

root_agent = LlmAgent(
    name="validation_agent",
    model="gemini-2.5-flash",  # Or your preferred Gemini model
    instruction="You are in charge of validating the user's initial input",
    description="""The user is going to play a game of 20 Questions.
    Before starting the game, you need to validate the user's input to
    ensure it is a valid object for the game. A valid object should be
    a noun like the name of an object, an animal, or a concept, and not
    too obscure. For example, "cat", "car", "apple" are valid, but
    "quantum entanglement" or "the number seven" are not.""",
    output_schema=ValidationOutput,
)

https://github.com/yoyu777/adk-game-demo-public/blob/main/validation_agent/agent.py
  • 281.
Debug the agent in the browser with > adk web, or via the ADK command line interface with > adk run [agent_name]
  • 282.
Call the agent in the game: the control layer (game_controller.py) reaches the agents through adk_runners.py (ADK runners & sessions).
  • 283.
Runners, Sessions & Agents – read more about the ADK Runtime: https://google.github.io/adk-docs/runtime/
  • 284.
async def initialise_validation_agent(self):
    session_service = InMemorySessionService()
    app_name = "ValidationAgent"
    self.validation_agent_session = await session_service.create_session(
        app_name=app_name,
        user_id="test_user"
    )
    self.validation_agent_runner = Runner(
        agent=validation_agent,
        session_service=session_service,
        app_name=app_name
    )
    logger.info("Validation agent initialized")
    return

https://github.com/yoyu777/adk-game-demo-public/blob/main/adk_runners.py
  • 285.
async def validate_input(self, user_input="elephant"):
    try:
        async for event in self.validation_agent_runner.run_async(
            user_id="test_user",
            session_id=self.validation_agent_session.id,
            new_message=types.Content(role='user', parts=[types.Part(text=user_input)])
        ):
            if event.is_final_response():
                logger.debug(event.content.parts[0].text)
                return json.loads(event.content.parts[0].text)
            else:
                pass
    except Exception as e:
        logger.error(f"Error during validation: {e}")
        return None

https://github.com/yoyu777/adk-game-demo-public/blob/main/adk_runners.py
  • 286.
  • 287.
The guessing agents: a custom orchestrator agent coordinates two LLM agents – each new round the guess agent makes a guess; if it is not confident, the question agent produces the next question.
  • 288.
class GuessOutput(BaseModel):
    guess: str = Field(..., description="The final guess for what the user is thinking of")
    confidence: int = Field(..., description="Confidence level (1-10) in this guess")
    reasoning: str = Field(..., description="Explanation of why this is the best guess. Summarise in less than 20 words.")

# Guessing Agent - Responsible for making final guesses
guessing_agent = Agent(
    name="guessing_agent",
    model="gemini-2.5-flash",
    instruction="You are an expert at making educated guesses in 20 Questions game",
    description="""You analyze all the information gathered from previous
    questions and answers to make the best possible guess about what the
    user is thinking of.""",
    output_schema=GuessOutput,
    output_key="guess_output",  # this is how you pass data between agents
)

https://github.com/yoyu777/adk-game-demo-public/blob/main/question_agents/agent.py
  • 289.
class QuestionOutput(BaseModel):
    question: str = Field(..., description="A strategic yes/no question to ask")
    reasoning: str = Field(..., description="Explanation of why you ask this question. Summarise in less than 20 words.")

# Asking Agent - Responsible for generating strategic questions
asking_agent = Agent(
    name="asking_agent",
    model="gemini-2.5-flash",
    instruction="You are an expert at asking strategic yes/no questions in 20 Questions game",
    description="""You specialize in asking the most effective yes/no
    questions to narrow down possibilities. Your goal is to eliminate as
    many possibilities as possible with each question. Consider categories
    like: …""",
    output_schema=QuestionOutput,
    output_key="question_output",
)

https://github.com/yoyu777/adk-game-demo-public/blob/main/question_agents/agent.py
  • 290.
  • 291.
class RootAgent(BaseAgent):  # extending the BaseAgent class
    guessing_agent: Agent
    asking_agent: Agent

    def __init__(self, name: str, guessing_agent: Agent, asking_agent: Agent):
        super().__init__(
            name=name,
            guessing_agent=guessing_agent,
            asking_agent=asking_agent,
            sub_agents=[guessing_agent, asking_agent],
        )

    # overriding the implementation, generating a series of events
    async def _run_async_impl(
        self, ctx: InvocationContext
    ) -> AsyncGenerator[Event, None]:
        # (custom logic)
        yield event

https://github.com/yoyu777/adk-game-demo-public/blob/main/question_agents/agent.py
  • 292.
# calling the guessing agent
async for event in self.guessing_agent.run_async(ctx):
    yield event

# accessing session state
guess_output = ctx.session.state.get("guess_output", None)
confidence = guess_output.get("confidence") if guess_output else None

# deterministic logic
if confidence is not None and confidence >= 9:
    logger.info("High confidence guess, proceeding to make guess")
    yield self.create_text_response_event(dumps({
        "action": "make_guess",
        "guess": guess_output.get("guess"),
        "reasoning": guess_output.get("reasoning")
    }), invocation_id=invocation_id)
    return

# calling the asking agent
async for event in self.asking_agent.run_async(ctx):
    yield event

https://github.com/yoyu777/adk-game-demo-public/blob/main/question_agents/agent.py
  • 293.
  • 294.
● ADK helps you build applications ● important concepts: Runners, Sessions, Agents ● simple agent, multi-agent & custom agent. We didn’t cover tools, workflow agents, or using custom models, which are also interesting. Here is an excellent learning resource: https://codelabs.developers.google.com/onramp/instructions and the demo repo: https://github.com/yoyu777/adk-game-demo-public
  • 295.
  • 296.
  • 297.
  • 298.
  • 299.
Prerequisites: cloud-shell; gemini-cli (available in Cloud Shell); gcloud mcp; cloud-run mcp
  • 300.
Overview – Part 1: claim the GCP credit; link it with a billing account and project; install Gemini CLI's gcloud MCP; visit the Prompt Generator. Part 2: create a VPC network; create a subnet; create a VM instance and install NGINX; check the server.
  • 301.
Links – Credit: trygcp.dev/claim/devfest-baku · Repo: github.com/alper-sari/geminicli-cloud-shell-tutorial
  • 302.
  • 304.
Catch the Bug Before It Blinks: Pro Testing for Front-End Devs – Javid Aliyev (@thinkingIT), Software Engineer
  • 305.
  • 306.
Global companies I worked at: Processica (branch of AWS), Cymulate (Israel)
  • 308.
1. Why we need testing
  • 309.
Benefits of front-end testing (@GDG): identifies bugs; ensures consistency; cross-browser/device compatibility; faster development cycle; scope for third-party integration.
  • 310.
  • 311.
  • 312.
Differences between back-end and front-end testing – Backend testing: focuses on the functionality of the server and database; ensures performant APIs; does not require a browser. Frontend testing: focuses on the interaction between the user and the software; does not require a database; may require a browser.
  • 313.
  • 314.
Tools for each test phase – Unit testing: Jest, Vitest, Mocha. Performance testing: Lighthouse (Chrome DevTools), WebPageTest, K6. End-to-end (E2E) testing: Playwright, Cypress, Selenium, WebdriverIO. Integration testing: Vitest (component + store), Jest, React Testing Library. Cross-browser testing: Playwright (Chromium, Firefox, WebKit), BrowserStack, Sauce Labs. Accessibility testing: axe-core (industry standard), jest-axe, Lighthouse Accessibility Audit, Pa11y.
  • 315.
Tools for each test phase – Visual regression testing: Playwright snapshots, Chromatic (for Storybook), Percy, Applitools Eyes, Loki. Acceptance testing: Cypress (business-flow validation), Playwright, Testim / QA Wolf (automated acceptance frameworks).
  • 316.
  • 317.
  • 318.
Questions? LinkedIn: @javid aliyev · Telegram: @alyevv · GitHub: @cavid-aliyev