How are developers using data in 2024?

From January 15 to February 16, 2024, we conducted a survey of 780+ technical professionals about how they’re using data when planning, building, and maintaining applications. You can access the raw results here.

Why A Survey on Data?

Let’s face it, ‘data’ doesn’t exactly scream “blockbuster thriller.”


Funny thing, though – it’s this very unglamorous data that quietly runs the show behind every app, every click, and every digital interaction we make.

In our little tech bubble, it’s all about the shiny new toys – the latest frameworks, the AI breakthroughs. Everyone buzzes about React, Vue, Next, Nuxt, TensorFlow… But when did you last hear a fiery debate over SQL vs NoSQL?

Truth be told, we didn’t know what to expect with this survey. But what do you know? Responses started flooding in.

One month and 782 responses later, here we are. Data’s not dull; it’s just been waiting for its moment in the spotlight.

Methodology and Delivery of the Survey

Audience and Delivery

To conduct this survey, we partnered with our friends at Bytes Newsletter for promotion. 

We wanted to capture a large segment of developers and engineers who are unfamiliar with our Directus offering, to ensure the data is as unbiased as possible. 

The survey itself was built and delivered via Typeform on an external domain not hosted by Directus. 

Questions and Themes

Directus Developer Relations and Marketing teams developed the survey questions, with oversight from the Directus core engineering team.

Question formats are a mix of single-choice, multiple-choice, and open-ended. The survey consists of 40 questions and takes, on average, 13 minutes to complete. 

Timeline and Incentives

We made the survey publicly available on January 15, 2024 and closed it on February 16, 2024. Respondents could opt in for a chance to win 1 of 5 gift cards ($700 total in gift card prizes). 

Respondent Firmographics

At the beginning of the survey, we asked respondents a series of questions to validate their background and experience level. As expected, there was a wide range of responses. 

For the purpose of this report, we decided to focus primarily on three cross-sections of data:

  • Respondent location

  • Organization size 

  • Years of experience in data

This approach provided the most interesting insights when parsing the data for each category. 

Respondent Location

Question: “Which region are you based in?” 

Different regions have a wide range of data processing regulations – from GDPR in Europe to state-based considerations like the CCPA in the U.S. – that determine how organizations collect, store, retrieve, and maintain data. 

We had a wide range of respondents, so to keep the numbers manageable, we’ve grouped them into three buckets, with Europe accounting for nearly half of all responses.

  • Europe – 48% (378 responses) 

  • North America – 27% (209 responses)

  • All Else (Asia, Oceania, South America, Africa, and other) – 25% (195 responses)

Organization Size

Question: “What is your organization’s size?”

The size of an organization often plays a part in how data is handled. Typically, smaller organizations have more freedom with using data, while larger organizations tend to have layers of checks and balances. 

We had a wide range of respondents, so we decided to group these into similar-sized buckets, as well. 

  • Small Organizations (1-49 employees) – 45% (349 responses)

  • Medium Organizations (50-499 employees) – 28% (222 responses)

  • Large Organizations (500+ employees) – 27% (211 responses)

Experience Working With Data

Question: “How much experience do you have working with data?” 

Years of experience is important, because earlier-stage data professionals are still learning the ropes, and more experienced data professionals tend to have a preferred stack. 

  • Novice (0-3 years of experience) – 39% (307 responses)

  • Intermediate (4-9 years of experience) – 37% (290 responses)

  • Expert (10+ years of experience) – 24% (185 responses)

Part 1 – Database Technologies

Before we dive in, let’s get clear on some of the larger trends we’ve been seeing outside of our survey: 

  • Multi-model databases are gaining traction for their ability to handle various data models all in one place. Developers can pick and choose the right tools for the job, streamlining projects and simplifying data management.
  • Cloud services are also changing the game with Database as a Service (DBaaS) and serverless databases. DBaaS cuts down on the heavy lifting involved in database maintenance with a cloud-based solution that’s scalable and easy to manage. Serverless databases go a step further by automatically adjusting to your data needs, which means you’re only paying for what you use — smart and cost-effective.
  • On the tech front, databases are getting smarter with the integration of AI and machine learning. Now, they can do more than just store data; they can analyze it and make predictions, too, adding a whole new layer of intelligence to applications.
  • Security is also getting a boost with more robust features to keep data safe. And for those looking into cutting-edge tech, blockchain databases offer a peek into the future of secure and transparent data management.
  • Specialized databases for handling time-series and vector data are making it easier to manage the flood of data from IoT devices and AI applications. 

These trends represent a shift towards databases that are not only more powerful but also more intuitive and aligned with the needs of modern applications.

Meanwhile, the vast amount of data that exists in the world continues to grow exponentially, with estimates suggesting that the global data sphere could reach 175 zettabytes by 2025. This data explosion is driving the need for more sophisticated and scalable database solutions – and database vendors are delivering. 

In fact, the database management systems (DBMS) market is projected to reach $152.36 billion by 2030, growing at a CAGR of 11.56% from 2023 to 2030.

#1 – Relational databases are the go-to for the majority – 57% – of engineers. 


But not all databases share the love. Our survey showed that graph, time-series, search, vector, and NewSQL databases each garnered less than 1% usage out of all 780+ developers surveyed. 

We also found that experience sways database preference. 

  • Newbies (those with less than a year of experience working with data) tend to prefer MongoDB,
  • Mid-career data professionals (with 7-10 years of experience) typically go for Postgres,
  • And the seasoned veterans (those with a decade or more of experience) most often go for MariaDB.

But Why?

As you probably know, MongoDB is a leading NoSQL database, designed for working with large sets of distributed data. It’s popular because of its flexibility and scalability, and it’s particularly well-suited for applications requiring rapid development and handling of diverse types of data, such as big data applications and content management systems.

The open-source RDBMS, Postgres, offers advanced features like support for complex queries, foreign keys, triggers, views, transactional integrity, and multi-version concurrency control (MVCC). PostgreSQL is often chosen for applications requiring high levels of customization and scalability.
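
To make the contrast concrete, here’s a minimal sketch (not from the survey) of the same lookup in each world, using the official `mongodb` and `pg` Node clients; the connection strings and the `articles` collection/table are hypothetical:

```typescript
import { MongoClient } from "mongodb";
import { Client } from "pg";

// MongoDB: schemaless documents, queried with plain JavaScript objects.
async function findPublishedMongo() {
  const mongo = new MongoClient("mongodb://localhost:27017"); // hypothetical server
  await mongo.connect();
  const articles = mongo.db("blog").collection("articles");
  const published = await articles.find({ status: "published" }).toArray();
  await mongo.close();
  return published;
}

// Postgres: typed tables, queried with SQL (parameterized for safety).
async function findPublishedPostgres() {
  const pg = new Client({ connectionString: "postgres://localhost:5432/blog" }); // hypothetical server
  await pg.connect();
  const { rows } = await pg.query(
    "SELECT id, title, published_at FROM articles WHERE status = $1",
    ["published"]
  );
  await pg.end();
  return rows;
}
```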

Given these characteristics, our findings make a lot of sense. Someone with less experience wouldn’t necessarily need or want advanced features, and newbies might want the simplicity and flexibility MongoDB offers.

A lot of beginner tutorials use MongoDB as it’s easier to work with if you only know JavaScript. For SQL databases, you need to know how to work with SQL, which has a way higher learning curve. Once you start building ‘real things,’ the pros and cons quickly become clear and the learning curve becomes worth it.

Location seems to make a difference, as well:

  • European engineers prefer MongoDB and MySQL,
  • while those in North America prefer MySQL and Postgres.

As we navigate through this sea of data, it’s clear that the choice of database technology is not just a matter of function but also of familiarity, experience, and even geography. 

Part 2 – Database Hosting

Once you choose your database technology, you need to host it in an optimized, secure environment – that is, if you want to avoid setting up expensive infrastructure in-house. 

But there are a few different options. Before we reveal our survey results in this area, let’s level-set on a few of those options, based on the most popular database hosting trends we see:

  • Cloud-based Hosting: Amazon Web Services (AWS), Microsoft Azure, Google Cloud Platform – all of these tech giants are in the cloud hosting game. Their managed database services (Amazon RDS, Azure SQL Database, and Google Cloud SQL) handle much of the database management workload, including backups, patching, and scalability, helping companies cut costs and boost scalability and flexibility.

  • Database as a Service (DBaaS): Closely related to cloud hosting (but different) is DBaaS, which offers an even more streamlined approach. With DBaaS, the cloud provider not only hosts the database but also manages the software and hardware infrastructure, which lets you focus on building apps. This approach is getting super popular: the market was valued at over $16 billion in 2022 and is projected to grow at a CAGR of 16.5% through 2032.

  • Serverless hosting: Want to focus exclusively on writing and deploying code? Serverless hosting is your best friend. You don’t have to worry about scaling, maintenance, or updates at all – the underlying infrastructure that runs the code is abstracted away, making it invisible to you, the developer. Plus, you only pay for what you use. This type of hosting is perfect for applications with variable workloads and those designed around microservices.

#1 – Cloud is above all else

So what did the survey reveal? No shocker, but devs prefer to cloud-host.

Where do developers host their databases?

What did catch our attention was the rise of DBaaS in small and mid-sized organizations – it was the second-most preferred hosting method for companies with fewer than 500 employees. 

For organizations with 500+ employees, on-premises hosting took second place in the database hosting popularity contest. 

How do you consider cost when choosing a hosting solution for your database?

Both mid-sized companies and larger enterprises said self-hosting was their second choice, right behind cloud hosting.

#2 – Pricing Considerations for Hosting 

Is cost the primary factor in deciding how to host? Not exactly. 


45% of respondents noted that cost was a factor to consider, but not a particularly critical one. When we cross-sectioned this data, however, there was something that caught our eye: 

  • For respondents who said cost was the primary factor, the second-most popular database hosting route was Database-as-a-Service providers (e.g. PlanetScale, FaunaDB, Xata)
  • For respondents who said they don’t handle cost considerations, on-prem hosting was the second-most popular. 

When it does come to pricing, however, respondents agreed on the two factors that should MOST influence pricing:

  • Bandwidth usage
  • Machine resource utilization (e.g. RAM/CPU)

The factors that should LEAST influence pricing?

  • Storage space
  • Number of database records

Clearly, money is no object when it comes to speed and performance. 

What factors should primarily influence hosting?

Part 3 – Scaling and Demand

A few nights ago, I caught “BlackBerry” (incredible movie, by the way), and it got me thinking. There’s this intense scene where everything just…crashes. 

That crash really happened, back in 2011, and it was a massive mess: emails stuck in limbo, people cut off from their lifelines. It was chaos born from success, a kind of ‘too much of a good thing’.

That scene, and the real-life event it mirrors, are perfect analogies for what we’re unpacking here: scalability and demand. Keeping your systems up and running smoothly as they grow is critical. It’s not just about avoiding the bad PR of a crash; it’s about the trust of your users and the reputation of your tech.

In this section, we’re breaking down what we learned around real-world strategies and behind-the-scenes action that keep apps and services from becoming a cautionary tale like our friends at BlackBerry.

#1 – Org size directly reflects scaling method


While not shocking, there is a direct correlation between the size of an organization and the way they handle scale for data infrastructure.

Small organizations with teams of 1-49 members often lean towards vertical scaling.

That’s essentially upgrading existing systems to be more powerful. It makes sense for smaller groups—they typically have less complexity to manage, and vertical scaling can be a simpler, more immediate solution.

Smaller orgs may lack the resources for more complex scaling strategies, or they might not need them yet. Vertical scaling can be less of a hassle and doesn’t require the intricate systems larger companies have to deal with.

Mid-sized organizations, those with 50-499 people, tend to prefer horizontal scaling.

This means they add more machines or nodes to their infrastructure to distribute the load.

For companies in this growth phase, horizontal scaling offers flexibility. They can incrementally add resources as they grow, without overhauling existing systems. It also allows them to start building more robust systems that can handle future expansion.

Enterprises with 500+ employees generally opt for load balancing.

This involves distributing workloads across multiple servers to optimize resource use, decrease response times, and prevent any one server from becoming overloaded.

Big companies face big demand, and load balancing allows for smoother handling of high traffic across global operations. It’s a more advanced approach that can maximize efficiency and reliability at scale, something large organizations can’t afford to compromise on.
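
As a rough illustration of the idea (not a production load balancer), here’s a minimal round-robin sketch in TypeScript that spreads requests across a pool of hypothetical upstream servers:

```typescript
// Minimal round-robin load balancing sketch: each request is forwarded to
// the next upstream in the pool, so no single server absorbs all the traffic.
const upstreams = [
  "http://app-1.internal:3000",
  "http://app-2.internal:3000",
  "http://app-3.internal:3000",
]; // hypothetical server pool

let next = 0;

function pickUpstream(): string {
  const target = upstreams[next];
  next = (next + 1) % upstreams.length; // rotate through the pool
  return target;
}

async function forward(path: string): Promise<Response> {
  // fetch is built into Node 18+ and browsers
  return fetch(`${pickUpstream()}${path}`);
}
```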

#2 – Enterprises anticipate demand, everyone else automates (with caps)


Our survey revealed a surprising split in how different-sized organizations handle demand spikes. While 30% of developers from small to mid-sized businesses prefer to automate scaling within certain limits, the big players do things differently.

SMBs (1-499 employees) are setting up their systems to automatically scale resources with a cap. This gives them a safety net, ensuring they don’t overshoot their budget or resources when demand increases.

Automatic scaling with a cap means they’ve got a system that responds in real time to increased usage, but won’t exceed predefined limits. This approach keeps costs predictable, which is crucial for smaller players who need to balance growth with financial realities.

SMBs favor this approach because it’s a set-it-and-forget-it solution. It allows them to focus on development without constant monitoring of their infrastructure.
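
As a hedged sketch of what “automatic scaling with a cap” can look like in practice, here’s a toy scaling decision function; the thresholds, replica counts, and CPU metric are all hypothetical:

```typescript
interface ScalingPolicy {
  minReplicas: number;
  maxReplicas: number;      // the cap that keeps costs predictable
  targetCpuPercent: number;
}

// Decide how many replicas to run for the observed load, never exceeding the cap.
function desiredReplicas(
  policy: ScalingPolicy,
  currentReplicas: number,
  currentCpuPercent: number
): number {
  // Proportional rule: scale replica count with observed load vs. target load.
  const raw = Math.ceil(currentReplicas * (currentCpuPercent / policy.targetCpuPercent));
  return Math.min(policy.maxReplicas, Math.max(policy.minReplicas, raw));
}

// Example: 4 replicas at 90% CPU against a 60% target wants 6, but is capped at 5.
const policy: ScalingPolicy = { minReplicas: 2, maxReplicas: 5, targetCpuPercent: 60 };
console.log(desiredReplicas(policy, 4, 90)); // 5
```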

Enterprises (500+ employees) tend to anticipate demand and temporarily increase resources. They’ve got the resources and perhaps more importantly, the experience to predict when demand will spike and plan accordingly.

Anticipating demand means these organizations are analyzing trends, usage data, and other indicators to foresee when they’ll need more power and prepare their systems in advance.

Enterprises may opt for this proactive strategy to maintain more control over their environments and ensure a premium user experience, which is often more critical at scale.

Interestingly, those with a decade or more in the field often choose to over-provision resources. It’s a more conservative approach, maintaining more capacity than needed just in case.

This could stem from hard-earned lessons. Veterans of the data wars might prefer the assurance that comes with ready reserves, having seen the fallout from underestimating demand in the past.

It’s the tech equivalent of keeping a spare tire in the trunk—not essential until the day it definitely is.

#3 – Client-side is the caching method preferred across the board


Regardless of experience, org size, or region, one approach to caching has emerged as the universal go-to: client-side data caching.

It’s the favorite across companies of all sizes, with a whopping 58.7% adoption according to our survey.

Client-side caching takes the pressure off the server. By storing data locally on the client’s device, it can drastically reduce load times and server requests. It’s a game-changer for user experience.

With data readily available on the user’s own device, repeat access is lightning fast. This also means less data traffic hitting the servers, which can translate to cost savings on server resources.

Its popularity probably comes from the immediate performance boost it offers, along with its relative ease of implementation. It empowers users with quick access and takes a load off your infrastructure—it’s a win-win.
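
For illustration, here’s a minimal client-side caching sketch in TypeScript: a fetch wrapper that keeps responses in memory with a time-to-live, so repeat access skips the network entirely (the endpoint and TTL are hypothetical):

```typescript
// Tiny client-side cache: repeat requests within the TTL are served from
// memory instead of hitting the server again.
const cache = new Map<string, { data: unknown; expiresAt: number }>();

async function cachedFetch<T>(url: string, ttlMs = 60_000): Promise<T> {
  const hit = cache.get(url);
  if (hit && hit.expiresAt > Date.now()) {
    return hit.data as T; // served locally: no network round trip
  }
  const res = await fetch(url);
  const data = (await res.json()) as T;
  cache.set(url, { data, expiresAt: Date.now() + ttlMs });
  return data;
}

// Usage: the second call within a minute never touches the server.
// await cachedFetch<{ id: number }[]>("/api/products");
// await cachedFetch<{ id: number }[]>("/api/products");
```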

Behind client-side, server in-memory caching is also a strong contender, particularly for real-time applications where speed is key. Then there’s database-level caching and distributed/edge caching, highlighting a trend towards distributed architectures, and ensuring data is close to the user, no matter where they are.

The strategy is clear: minimize latency, maximize performance, and keep your user smiling, whether they’re clicking a button or loading a video. Caching is not just a performance enhancer; it’s an essential component in the modern developer’s toolkit for creating smooth and scalable applications.

#4 – Large enterprises and experienced developers prefer distributed computing


Yes, the majority of developers don’t seem to care about distributed computing. But we did find an interesting trend: the larger the organization, the more pronounced the push towards distributed computing.

Similarly, seasoned developers with 10+ years under their belt show a tendency to either partially or fully embrace distributed computing to snip response times.

Distributed computing involves spreading tasks across multiple computers or servers, reducing the load on individual machines and speeding up processing times. For huge organizations, this can mean the difference between a sluggish application and a smooth user experience.

Large enterprises with 1000+ employees are the frontrunners in implementing partially or fully distributed compute systems.

These organizations likely have the infrastructure and, crucially, the scale of operations where the benefits of distributed computing—like high availability and disaster recovery—outweigh the complexities of setting it up.

As for the veteran developers, their leaning towards distributed systems could be informed by their years of navigating system failures and bottlenecks.

They’ve probably seen their fair share of server crashes and slow-downs. Their experience tells them that distributing the load can create a more resilient and responsive system, especially during unexpected surges in demand.

The trend is clear: As organizations grow and as developers accrue experience, there’s a distinct move towards leveraging the distributed nature of modern computing.

This approach not only improves response times but also paves the way for future expansions and technological advancements. It’s a sophisticated strategy—think of it as not putting all your eggs in one basket, but using a whole fleet of drones to deliver each egg safely and swiftly.

Part 4 – Security and Privacy

Recently, a major security incident involving XZ Utils, a popular data compression tool in Linux systems, made headlines. 

The situation was critical — a backdoor had been inserted into versions 5.6.0 and 5.6.1 that could allow for remote code execution (RCE) by bypassing SSH authentication, potentially granting attackers unauthorized access to systems worldwide. This particular vulnerability was given a 10.0 severity rating, indicating its high criticality. 

What’s particularly interesting about this incident is how the backdoor was programmed to complicate analysis, making it challenging to detect. 

It targeted specific Linux distributions and SSH interactions and was designed to execute under very particular conditions, suggesting a sophisticated and calculated attack. The backdoor code was well hidden; it was absent in the Git distribution of the software and only included in the full download package, through a sequence of obfuscations.

The scariest part, though? It was only found by a random engineer at Microsoft who noticed failing login attempts were using a substantial amount of CPU…because they were taking 500ms longer than usual (that guy is an absolute legend, by the way.)

For anyone working in tech, this event is a critical reminder of the importance of vigilance in security and privacy in *whatever* you’re building. 

This week, we’re diving into the findings from our State of Data survey all around security and privacy. 🔓

#1 – 92% of developers employ TLS/SSL


Almost all developers across every sector use TLS/SSL to encrypt data in transit, making it by far the most popular method to secure data. 

The second most common approach, particularly among enterprise organizations, is the use of VPNs (Virtual Private Networks), and the third is signing data with PGP (Pretty Good Privacy) keys.

TLS (Transport Layer Security) and SSL (Secure Sockets Layer) are foundational protocols for encrypting data as it moves across the internet, ensuring that any data intercepted during transmission remains unreadable. 

The strong preference for TLS/SSL likely reflects its effectiveness and ease of integration, as these protocols have become standard security measures for protecting data online. 
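
As a hedged illustration of how little ceremony this takes (the certificate paths are placeholders), here’s roughly what serving traffic over TLS looks like for a Node service:

```typescript
import { readFileSync } from "node:fs";
import { createServer } from "node:https";

// Serve traffic over TLS: clients connect via https:// and anything intercepted
// in transit is unreadable. Key and certificate paths are placeholders.
const server = createServer(
  {
    key: readFileSync("/etc/ssl/private/server.key"),
    cert: readFileSync("/etc/ssl/certs/server.crt"),
  },
  (req, res) => {
    res.writeHead(200, { "Content-Type": "application/json" });
    res.end(JSON.stringify({ ok: true }));
  }
);

server.listen(443);
```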

VPNs extend this security by creating private networks across public internet connections, which encapsulate and encrypt all data entering or leaving a system. VPNs are favored especially by larger organizations due to their ability to secure remote connections—a necessity in today’s mobile and flexible work environments. 

PGP keys add another layer of security by allowing data to be encrypted and signed, ensuring both the privacy and authenticity of the information. While less common, PGP key usage highlights an ongoing commitment to not only protect data privacy but also verify the integrity and origin of the data, crucial in preventing tampering and fraud. 

These methods together demonstrate a robust, layered approach to security, emphasizing not just defense against external threats but also the importance of data integrity and access controls.

#2 – Developers prefer RBAC over MFA for managing resource access.


A significant 76.3% of organizations employ Role-Based Access Control (RBAC) as their principal method for managing resource access, with the use of Multi-Factor Authentication (MFA) also prevalent, particularly in larger organizations.

RBAC restricts system access to authorized users based on their roles within the organization, assigning specific permissions that align with their responsibilities. This method effectively limits access and reduces the risk of internal threats. 

MFA enhances this security model by requiring users to provide multiple forms of verification before access is granted. This might include a combination of something the user knows (password), something the user has (security token), and something the user is (biometric data).
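
Here’s a minimal RBAC sketch in TypeScript (roles, permissions, and the MFA policy are hypothetical) showing the basic idea of mapping roles to allowed actions, with MFA layered on top for sensitive ones:

```typescript
type Role = "viewer" | "editor" | "admin";
type Permission = "read" | "write" | "delete" | "manage_users";

// Each role maps to the set of actions it is allowed to perform.
const rolePermissions: Record<Role, Permission[]> = {
  viewer: ["read"],
  editor: ["read", "write"],
  admin: ["read", "write", "delete", "manage_users"],
};

interface User {
  id: string;
  role: Role;
  mfaVerified: boolean; // MFA layered on top of RBAC for sensitive actions
}

function canPerform(user: User, action: Permission): boolean {
  const allowed = rolePermissions[user.role].includes(action);
  // Example policy: destructive actions additionally require a completed MFA check.
  if (action === "delete" || action === "manage_users") {
    return allowed && user.mfaVerified;
  }
  return allowed;
}

// canPerform({ id: "u1", role: "editor", mfaVerified: false }, "write");  // true
// canPerform({ id: "u2", role: "admin", mfaVerified: false }, "delete");  // false until MFA
```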

The preference for RBAC is largely due to its scalability and effectiveness in managing complex permissions across an organization. 

As businesses grow and their structures become more complex, RBAC helps streamline access controls and ensures that proper security measures are maintained. 

When we looked deeper into the data, we found that larger organizations are particularly inclined to integrate MFA as part of their security posture, as well, which makes sense: they have an increased risk from targeted cyber attacks.

#3 – GDPR is the leading compliance standard for developers


Among surveyed organizations, 65% adhere to the General Data Protection Regulation (GDPR), highlighting its role as a leading standard for data protection compliance.

GDPR is a regulation that requires businesses to protect the personal data and privacy of EU citizens for transactions that occur within EU member states. It also regulates the exportation of personal data outside the EU. 

Compliance with GDPR not only helps in avoiding hefty fines but also boosts consumer trust by ensuring data is handled securely and transparently. This regulation has set a benchmark for privacy laws worldwide, influencing how organizations manage and secure user data.

The high compliance rate with GDPR among organizations can be attributed to its stringent enforcement and the global nature of modern businesses. Although non-European organizations also lean on other compliance frameworks (in particular the CCPA and HIPAA), GDPR remains a de facto standard for any business operating in the digital economy. 

The adoption of GDPR practices often leads to improvements in overall data management strategies, prompting organizations to prioritize privacy not just to comply with legal requirements, but also to enhance their reputation and integrity. 

Larger organizations, in particular, tend to be ahead in compliance as they possess the resources to implement complex data protection policies and handle the legal aspects of data privacy more effectively.

Part 5 – Data Handling and Loading

In app and site development, efficiency is king.

There’s always been a big emphasis in the tech community on keeping things lean: assets, databases, pages, and more. But in a recent (and awesome) article, Niki at tonsky.me did a deep dive into the web bloat creeping into every site and app these days. 

Turns out, apps that should be light and fast are getting bogged down by things like JavaScript. Even simple blogs can lug around more code than a fully-fledged video game from the ’90s – and that was already true back in 2015 🤯


That brings us to this week’s focus, which is all about data handling and loading. The inspiration for this section came from a few specific questions that devs should always ask during the build process:

  • When should app data load?
  • Should everything load upfront, or fetch as we go?

The choices you make during the build process have long-term consequences. Our goal is to give you a breakdown of how other devs approach data loading so you feel more confident in the decisions you make.

#1 – 85% of devs prefer to fetch data asynchronously 


When it comes to data loading, the trend is clear: 85% of developers prefer to build apps that fetch data asynchronously—meaning they load data on the fly as users need it, rather than all upfront.

This is significant because async loading is key to making apps feel speedy. Instead of making users wait for everything to load at once, async lets them get started, then quietly pulls in the data as they go. It’s like getting to nibble on your appetizers while the chef finishes cooking the main course—your experience starts good and only gets better.

Async loading also reflects a shift towards more dynamic, user-centric app experiences. By prioritizing immediate engagement over complete pre-loading, developers can ensure users aren’t left staring at loading screens, which is crucial in a world where a delay of a few seconds can mean losing a user’s attention.
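
A minimal sketch of the pattern: render something immediately, then fill the data in once the async fetch resolves (the endpoint and element IDs are hypothetical):

```typescript
// Render the page shell right away, then fetch data asynchronously and
// patch it in when it arrives, instead of blocking on a full upfront load.
async function loadComments(postId: string): Promise<void> {
  const list = document.getElementById("comments")!;
  list.textContent = "Loading comments…"; // the user sees the page immediately

  const res = await fetch(`/api/posts/${postId}/comments`);
  const comments: { author: string; body: string }[] = await res.json();

  list.textContent = "";
  for (const c of comments) {
    const item = document.createElement("li");
    item.textContent = `${c.author}: ${c.body}`;
    list.appendChild(item);
  }
}

void loadComments("42");
```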

#2 – WebSockets are preferred for Data Streaming


When it comes to handling real-time data, WebSocket protocols are the preferred route. Nearly half of developers prefer to build apps with WebSockets, which makes it the go-to for keeping things current without constant page refreshes. 

WebSockets allow for two-way communication between the user’s browser and the server. Think of it as live chat vs. sending emails back and forth. This is crucial for apps that rely on live data, like social media platforms or live stock trading. 

Not lagging far behind (get it? 😛), polling for updates is still a pretty common approach, too. It’s a periodic approach that isn’t instantly updated, but it all really depends on the app’s needs – if the app is more static, taking a few minutes to fetch new data isn’t the end of the world.
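
To show the difference concretely, here’s a hedged browser-side sketch: a WebSocket connection that receives pushes as they happen, next to a polling loop that asks on an interval (the URLs are hypothetical):

```typescript
// WebSockets: the server pushes updates the instant they happen.
const socket = new WebSocket("wss://example.com/prices");
socket.addEventListener("message", (event) => {
  const update = JSON.parse(event.data);
  console.log("live update", update);
});

// Polling: the client asks on a fixed interval, so updates can lag by up to
// one interval, but the implementation is dead simple.
setInterval(async () => {
  const res = await fetch("https://example.com/api/prices");
  console.log("polled snapshot", await res.json());
}, 30_000);
```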

#3 – Devs are increasingly interested in islands architecture and micro-frontends


Our survey indicated a growing curiosity around ‘islands architecture’ and micro-frontends, with 36% of respondents wanting to learn more and 22% having already used them and planning to do so again.

The islands architecture concept, which is all about enhancing performance by only loading interactive components as needed, seems to be resonating with many developers. It’s especially handy for improving load times on complex sites. 

This is also an interesting trend because it reflects a shift in web and app development towards more modular and decoupled architectures. Instead of a single, monolithic codebase, micro-frontends allow for smaller, more manageable pieces that teams can develop and deploy independently.
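
As a rough sketch of the islands idea, here’s one way to hydrate an interactive widget only when it scrolls into view, while the rest of the page stays static HTML (the module path, `mountComments`, and the element ID are hypothetical):

```typescript
// Islands sketch: the page ships as static HTML, and this one interactive
// "island" only downloads its JavaScript when the user can actually see it.
const island = document.getElementById("comments-island");

if (island) {
  const observer = new IntersectionObserver(async (entries) => {
    if (entries.some((entry) => entry.isIntersecting)) {
      observer.disconnect();
      // Dynamic import: the widget's code isn't fetched until this moment.
      const { mountComments } = await import("./comments-widget.js");
      mountComments(island);
    }
  });
  observer.observe(island);
}
```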

#4 – 58% of developers rely on a global store


The use of global state management tools like Vuex, Pinia, Context, Redux, and Flux is widespread, with 58% of respondents using them and keen to continue doing so.

This speaks volumes about how apps are keeping their state in check. With global stores, you’re basically centralizing all your app’s state into one manageable hub. This means less hassle when tracking how data changes across your app—everything’s in one place, and you can watch the whole picture easily.

Why do developers like this approach? It’s a no-brainer for apps that have lots of moving parts that need to talk to each other. Think of a busy airport with flights arriving, departing, and loads of real-time updates. 

A global store keeps this complex flow of information organized and accessible, preventing a potential data traffic jam.
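
To make that concrete, here’s a tiny hand-rolled global store in TypeScript – a sketch of the pattern that libraries like Redux or Pinia implement far more robustly:

```typescript
// A minimal global store: one central state object, plus subscriptions so
// any part of the app can react when that state changes.
type Listener<S> = (state: S) => void;

function createStore<S>(initial: S) {
  let state = initial;
  const listeners = new Set<Listener<S>>();

  return {
    getState: () => state,
    setState(patch: Partial<S>) {
      state = { ...state, ...patch };
      listeners.forEach((listener) => listener(state)); // notify everyone
    },
    subscribe(listener: Listener<S>) {
      listeners.add(listener);
      return () => listeners.delete(listener); // unsubscribe handle
    },
  };
}

// Usage: every subscriber sees the same, single source of truth.
const store = createStore({ user: null as string | null, cartCount: 0 });
store.subscribe((s) => console.log("cart now has", s.cartCount, "items"));
store.setState({ cartCount: 3 });
```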

#5 – Developers are hooked on composables


A whopping 73% of developers have used composables or hooks and would use them again, which shows just how popular these tools have become for modern app development.

Composables and hooks let developers extract and reuse logic across components, which is kind of like having a set of handy tools you can take from project to project. This keeps the code DRY (Don’t Repeat Yourself) and testable.

Why do devs love them? They make components cleaner and more intuitive, and who doesn’t want that? In a world where we’re all about making things simpler and more efficient, hooks and composables are like a breath of fresh air, helping keep our codebases sleek and manageable.
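
Here’s a small example of the idea, written as a React-style hook (an equivalent Vue composable would follow the same shape); the endpoint is hypothetical:

```typescript
import { useEffect, useState } from "react";

// A reusable hook: any component can call useFetch() and get the same
// loading/error/data logic without copy-pasting it.
export function useFetch<T>(url: string) {
  const [data, setData] = useState<T | null>(null);
  const [error, setError] = useState<Error | null>(null);
  const [loading, setLoading] = useState(true);

  useEffect(() => {
    let cancelled = false;
    fetch(url)
      .then((res) => res.json())
      .then((json: T) => {
        if (!cancelled) setData(json);
      })
      .catch((err: Error) => {
        if (!cancelled) setError(err);
      })
      .finally(() => {
        if (!cancelled) setLoading(false);
      });
    return () => {
      cancelled = true; // avoid state updates after unmount
    };
  }, [url]);

  return { data, loading, error };
}

// In a component: const { data, loading } = useFetch<Product[]>("/api/products");
```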

Part 6 – APIs

Last month, there was a great article published in The Newstack all about how developers and API builders are intertwined. In summary – catering to developer-friendly tools and services is crucial for survival and growth in an increasingly saturated market.

The emphasis on developer experience as a “key competitive differentiator” is definitely a pivot within the tech industry. And you can see this by how all the hotshot start-ups – like Resend, PostHog, Airbyte, dub.co, and Hex – are challenging the established players in their respective spaces.

Prioritizing the developer experience highlights a broader trend of disruption, though. It’s not just about offering a service, but about how efficiently developers can integrate and use these services in their projects.

For the developer community, it’s an exciting time. We’re in an era where they can have a bigger impact on their companies and clients with more options and better tools. And it’s proven – improving the developer experience ACTUALLY DOES drive profitability! 😄

In this section, we’re going to dive into the results from our State of Data all around how developers are using and thinking about APIs.

#1 – 94% of developers prefer REST APIs

When working with APIs, what interface do you prefer?

REST (Representational State Transfer) remains the gold standard for APIs, preferred by an impressive 93.7% of developers surveyed. It’s clear that when developers need reliable, established methods for building web services, REST is still king.

The dominance of REST is significant because it reflects its maturity, wide acceptance, and versatility in handling various types of calls and data formats. It’s like the Swiss Army knife of APIs—known for simplicity and effectiveness, making it the first choice for most developers when designing scalable and flexible web applications.

While GraphQL is picking up pace, especially for complex systems that benefit from its efficient data fetching capabilities, it’s still far behind REST in terms of sheer usage. This tells us that while new technologies are making inroads, the foundational technologies remain critical in the development ecosystem, likely due to their proven reliability and the vast amount of resources and community support available.

#2 – JWT and API Keys are the most popular API authentication methods


With a substantial 73.8% of developers favoring JSON Web Tokens (JWT) for authentication, it’s clear that JWTs are the reigning choice for securing API interactions.

These tokens are valued for their ability to encode user credentials securely, offering a robust method to ensure data integrity and authenticity across requests.

The significance of JWT’s popularity lies in its compactness and self-contained nature, which simplify the authentication process across distributed systems. Their widespread adoption underscores the tech community’s prioritization of efficient, scalable security measures in application development.

Moreover, the fact that API keys closely follow at 71.5% indicates a strong preference for authentication mechanisms that support fine-grained access control and straightforward implementation. This trend reflects an overarching move towards enhancing security while maintaining ease of use in API interactions.
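
For illustration, here’s a hedged sketch of issuing and verifying a JWT with the widely used `jsonwebtoken` package (the secret and claims are placeholders):

```typescript
import jwt from "jsonwebtoken";

const SECRET = process.env.JWT_SECRET ?? "replace-me"; // placeholder secret

// Issue a compact, self-contained token the client sends with each request.
function issueToken(userId: string, role: string): string {
  return jwt.sign({ sub: userId, role }, SECRET, { expiresIn: "1h" });
}

// Verify the signature and expiry on every incoming request.
function verifyToken(token: string): { sub: string; role: string } | null {
  try {
    return jwt.verify(token, SECRET) as { sub: string; role: string };
  } catch {
    return null; // tampered with or expired
  }
}

const token = issueToken("user-123", "editor");
console.log(verifyToken(token)); // decoded claims, or null if invalid
```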

#3 – Inconsistent API data is the biggest challenge for developers


The biggest hurdle developers face when using APIs isn’t just about connecting them; it’s about making sense of the data they send back. Almost half of the respondents, around 49.7%, struggle with inconsistent data formats, making this the most significant challenge in API integration.

This is significant because inconsistent data formats can create bottlenecks in data processing, leading to errors, increased development time, and frustration. It’s crucial for systems to interpret and process data consistently to ensure seamless integration and functionality.

Understanding the structure and standardizing data formats can alleviate these issues, improving the reliability and efficiency of API interactions. Developers need tools and strategies that help normalize data across different sources, ensuring their applications run smoothly and data remains accurate and useful.

#4 – Half of developers manage API usage and costs with rate limits


Nearly half of developers (49.6%) manage API usage and costs by implementing API keys with rate limits.

This approach allows for controlled access and prevents overuse, which can be crucial for maintaining budget constraints and system stability.

Why is this significant? Using API keys with rate limits not only helps in controlling costs but also in safeguarding APIs from potential abuse and overload. This method offers a straightforward way to manage access and track how APIs are being used, making it easier for companies to optimize resource allocation and enhance security measures.

By setting clear limits, businesses can avoid unexpected spikes in usage that could lead to performance issues or inflated costs.
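
Here’s a minimal fixed-window rate limiter keyed by API key (the limit and window size are hypothetical), just to show the basic mechanism:

```typescript
// Fixed-window rate limiting per API key: allow N requests per window,
// reject the rest until the window resets.
const WINDOW_MS = 60_000;
const LIMIT = 100;

const windows = new Map<string, { count: number; windowStart: number }>();

function allowRequest(apiKey: string, now = Date.now()): boolean {
  const entry = windows.get(apiKey);
  if (!entry || now - entry.windowStart >= WINDOW_MS) {
    windows.set(apiKey, { count: 1, windowStart: now }); // start a fresh window
    return true;
  }
  if (entry.count < LIMIT) {
    entry.count += 1;
    return true;
  }
  return false; // over the limit: respond with HTTP 429 upstream
}
```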

#5 – 75% of developers combine data sources server-side


A significant 75.2% of developers handle data combination server-side, making it the preferred location for managing data from multiple sources.

Handling data on the server-side allows for complex operations like merging data streams or transforming data before it reaches the client, which can optimize performance and reduce bandwidth usage. This approach centralizes data processing tasks, which simplifies maintenance and enhances data security by consolidating data protection measures in one place.
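
A hedged sketch of the pattern: the server fans out to multiple sources, merges the results, and returns one trimmed-down payload to the client (both upstream URLs are hypothetical):

```typescript
// Combine two upstream sources on the server so the client makes one request
// and receives one already-merged, minimal payload.
interface Profile { id: string; name: string }
interface Order { userId: string; total: number }

async function getUserSummary(userId: string) {
  const [profileRes, ordersRes] = await Promise.all([
    fetch(`https://accounts.internal/api/users/${userId}`),
    fetch(`https://billing.internal/api/orders?user=${userId}`),
  ]);

  const profile = (await profileRes.json()) as Profile;
  const orders = (await ordersRes.json()) as Order[];

  // Transform and trim before anything leaves the server.
  return {
    id: profile.id,
    name: profile.name,
    orderCount: orders.length,
    lifetimeSpend: orders.reduce((sum, o) => sum + o.total, 0),
  };
}
```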

The dominance of server-side processing underscores its efficiency and reliability for handling diverse and voluminous data sources, particularly in environments where performance and security are critical.

This method provides a scalable way to manage data integration and deliver refined data to clients, supporting smoother user experiences and more robust data governance.

Part 7 – The Future 🔮

Developers have an…interesting…relationship with new technologies and tools.

More often than not, they’re the ones tinkering with the latest advancements before these technologies hit the mainstream. Lately, however, there has been a noticeable shift within the developer community, especially as AI has dominated headlines.

The initial excitement that usually greets new tech is now mixed with a healthy dose of skepticism.

Today’s developers are more adept at distinguishing between tools that offer genuine solutions and those that are simply wrapped in slick marketing. This discernment is crucial in an era where the promise of “a new world” often outpaces reality.

In this section of our State of Data survey, we aimed to understand not just their preferences, but also the underlying motivations that drive their choices.

#1 – Developers find new trends through news and social media, then get hands-on to assess.


Developers prefer real-time and easily accessible sources like news outlets, blogs, and social media platforms to quickly scan a vast array of information and stay informed without committing significant time or resources.

The significance of this finding lies in understanding how developers prefer to receive their information. With such a high percentage relying on digital media, it suggests that traditional methods or even formal training might be less effective for reaching this audience. This makes sense, as most devs are intrinsically curious.

Nearly 61% of respondents also experiment with technologies personally, which suggests that while digital content is critical for awareness, hands-on experience is equally valued for making final adoption decisions.

It’s critical for tools and tech in the developer ecosystem to offer some sort of open source or free version for them to test and trial.

#2 – The biggest challenge for emerging technology is how it integrates into existing systems.


This statistic reveals a major hurdle that developers often encounter: integrating new technologies with existing systems.

It’s a common challenge, highlighting the complexities involved when new tools must work seamlessly with older, established frameworks. This integration is critical, as it can significantly affect the efficiency and stability of existing infrastructures.

The importance of this challenge can’t be overstated. Integration issues can lead to significant delays in technology adoption and can escalate costs due to the need for customized solutions or additional training. This challenge not only impacts developers but also affects broader organizational goals, potentially hindering innovation and operational agility.

Equally notable is that nearly the same percentage of respondents (65%) identified the learning curve for new technologies as a challenge. This points to a dual hurdle: not only must new technologies integrate well, but they also must be accessible enough for teams to understand and use effectively.

Additionally, less than half of respondents (46%) indicated that assessing the impact on existing projects is a concern, suggesting that the initial integration and learning curve are perceived as more immediate barriers.


#3 – Other than AI, cloud technologies, edge computing, and low-code tools stand out as the hot trends.


So, what’s hot (other than AI, of course, which we’ll cover in the next section)?

There’s growing sentiment towards cloud-native solutions, edge computing, and the adoption of low-code platforms.

These technologies are shaping up to be the pillars of modern development practices, each addressing unique aspects of digital transformation.

  • Cloud technologies facilitate scalable and flexible infrastructure solutions, reducing overhead and enhancing operational agility. They also offer dynamic scaling and robust architectures that support a wide range of services, from databases to application deployment environments.
  • Edge computing focuses on processing data closer to the source, which is crucial for applications requiring low latency and real-time interactions. Our results showed that this is emerging as a critical component in scenarios where immediate data processing is essential, reducing dependencies on central data centers and minimizing delays.
  • Meanwhile, low-code platforms are lowering the barriers to application development, enabling a broader spectrum of people without traditional “developer” skillsets to build.

The convergence of these trends signifies a transformative period in tech, where accessibility, efficiency, and performance are all seeing simultaneous improvements. These trends are not isolated but are part of a broader movement towards more agile and user-friendly technology. 

Expect a deeper integration of advanced tech into diverse areas of business and personal life, influencing everything from corporate IT strategies to individual creative endeavors.

Part 8 – AI, Data, and Development

Have you delved lately?


What more appropriate way to wrap up our 2024 State of Data report than by delving into AI for our final section? Developers are using AI in clever ways, but they know the risks. How are they navigating those risks while using it responsibly?

Let’s look at the data, spot the trends, and see where AI is leading us (or, hopefully, where we’re leading AI 😅)

What We Learned

#1 – 64% of developers are using AI for code generation


AI tools are becoming essential for developers, with 64% of respondents using AI for code generation.

This high adoption rate is particularly notable across various organization sizes and experience levels. Whether in small startups or large enterprises, developers are leveraging AI to write code more efficiently.

It’s not just about generating code, though – AI helps automate repetitive tasks, allowing developers to focus on more complex and creative aspects of their work.

Despite the uncertainty around AI replacing developers (nice try Devin), this widespread use of AI actually highlights its role in boosting productivity for development processes. It also indicates a growing trust in AI tools, suggesting that developers are increasingly comfortable relying on these technologies for critical aspects of their coding tasks.

#2 – 47% of developers believe AI is too risky for database interactions


Nearly half of the respondents express concerns about AI handling database interactions.

The critical nature of database operations means that errors can have significant and irreversible impacts. Concerns include:

  • The possibility of AI making unintended changes
  • Crudely mishandling data
  • Introducing errors that could compromise data integrity

This caution underscores the need for safeguards and human oversight when deploying AI in database management situations.

It also suggests that while AI is embraced for its efficiency, there is still a significant level of skepticism about its reliability in high-stakes environments where precision is paramount.

Letting AI interact directly with a production database should be considered professional misconduct at this stage.

Without oversight, there is a risk of inaccurate or harmful changes to critical data. 

AI can make mistakes that are irreversible in a database. 

#3 – 40% do not see AI as too risky for decision-making


Despite the resounding opposition to AI in a database management capacity, a significant portion of respondents do trust AI for data analysis in decision-making.

Many developers and engineers are confident in AI’s ability to analyze data accurately and provide valuable insights. These professionals recognize the potential of AI to enhance decision-making processes by offering data-driven recommendations and uncovering patterns that might not be immediately apparent to human analysts.

If AI is trained correctly, it will analyze data accurately, but there’s a high risk of errors if the training data is flawed.

AI can have a better picture of the data as a whole but may miss nuanced human judgment.

There are so many upsides and AI is pretty good at handling repetitive tasks.

Last year, we found that “cautious optimism” was the #1 sentiment around AI. A lot of things get hyped and fail, but there’s never really been a risk of being replaced by those things.

What we’ve learned is that cautious optimism still lives on a year later. AI is valued for coding, but there’s a need for careful use in critical areas.