Contrary Research Rundown #121
The world runs on spreadsheets: how OneSchema is building the pipelines for “the new oil”, plus new memos on ElevenLabs, LangChain, and more
Research Rundown
As everyone on earth spends more and more time online, data volumes have exploded. Every click and swipe is a new data point. The average person spends nearly 6.5 hours online each day, which creates 15.9 terabytes of data daily. But the volume of business data puts that to shame. 41% of organizations are managing 500 petabytes of data at any given time, roughly 30,000x what the average person generates in a day.
In 2006, British mathematician Clive Humby coined the phrase, “data is the new oil.” Global data volume has grown over 500x since then! But if data is the new oil, where are the pipelines? As efficient as you might think the world has gotten at data collection and transfer, you’d be surprised.
Extract, transform, and load (ETL) is a methodology for “combining data from multiple sources into a large, central repository.” While there are large established ETL platforms, some would joke that Microsoft Excel is the most common ETL tool in the world.
People can use Excel to manipulate comma-separated values (CSV) files, a text-file format that uses “commas to separate values, and newlines to separate records.” Most people think of CSVs as the crappier spreadsheet they accidentally used to download their bank statements or birthday contacts.
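To make the format concrete, here is a minimal sketch in Python using the standard library's csv module. The sample data and the parse_csv helper are invented for illustration; they are not taken from OneSchema or any real export.

```python
import csv
import io

# Hypothetical sample data: one header line, then one record per line.
# The second record quotes a value that itself contains a comma.
raw = (
    "name,birthday,city\n"
    "Ada,1815-12-10,London\n"
    '"Hopper, Grace",1906-12-09,New York\n'
)

def parse_csv(text):
    """Return the records of a CSV string as a list of dicts keyed by header."""
    return list(csv.DictReader(io.StringIO(text)))

records = parse_csv(raw)
for record in records:
    print(record["name"], "|", record["birthday"], "|", record["city"])
```

Even this tiny file shows why the format is brittle: a value that contains a comma has to be quoted, and a naive split on commas would break that record into four fields instead of three.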
But in reality, most Fortune 500 companies are using hundreds of millions of spreadsheets each year to send data back and forth. Today, that’s the cutting edge in terms of pipelines for the new oil. Unsurprisingly, they’re quite brittle.
OneSchema is attempting to build a better pipeline. Whether companies are processing CSVs every now and then, or consistently as part of their business, OneSchema has built embeddable capabilities around every use case. The biggest problem that OneSchema is solving is the sheer number of edge cases in any given dataset, whether in dates, numbers, addresses, phone numbers, or any other field. For any particular data point, there are 20+ ways to store it.
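Dates are a good illustration of the edge-case problem. The sketch below is a simplified example of normalization, not OneSchema's actual approach: it tries a short, hypothetical list of formats and converts whatever matches to ISO 8601. The CANDIDATE_FORMATS list and the normalize_date helper are assumptions made for this example, and slash-separated dates are assumed to be month-first (US style).

```python
from datetime import datetime

# Hypothetical list of accepted formats; real-world data would need many more.
# Assumption: slash-separated dates are month-first (US style).
CANDIDATE_FORMATS = [
    "%Y-%m-%d",   # 2025-01-31
    "%m/%d/%Y",   # 01/31/2025
    "%d %b %Y",   # 31 Jan 2025
    "%B %d, %Y",  # January 31, 2025
    "%Y%m%d",     # 20250131
]

def normalize_date(value):
    """Return the date as an ISO 8601 string, or None if no format matches."""
    text = value.strip()
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue  # this format did not match; try the next one
    return None

for sample in ["2025-01-31", "01/31/2025", "January 31, 2025", "next Friday"]:
    print(repr(sample), "->", normalize_date(sample))
```

Five formats already need an explicit assumption to resolve ambiguity (is 03/04/2025 in March or April?), and anything outside the list falls through to None. Multiply that across every column type and every customer, and the case for a shared normalization layer becomes clearer.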
OneSchema believes there is a place for a universal data normalization engine. The idea is to capture the edge cases across millions of spreadsheets and find the unifying language to cleanly manage the entire collection. From there, AI is well suited to step in and optimize the manual processes that massive enterprises are spending tens of millions of dollars and thousands of working hours to manage. That’s the dream.
To learn more about OneSchema and how the company is building the pipelines for the new oil, check out our new memo on the company.
Secureframe is a platform that automates the compliance process for frameworks ranging from SOC 2 to ISO 27001, HIPAA, and PCI DSS. To learn more, read our full memo here and check out some open roles below:
Senior Product Manager - Remote
Senior Software Engineer (Full Stack) - Remote (Canada)
Maven Clinic is a virtual health clinic providing continuous care across fertility, pregnancy, parenting, pediatrics, mental health, and menopause. To learn more, read our full memo here and check out some open roles below:
Senior Software Engineer, iOS - New York, NY or Remote
Senior Data Analyst (Business Operations) - New York, NY or Remote
ElevenLabs offers an AI-powered voice generation platform that addresses the critical industry pain points of quality, trust, and reliability. To learn more, read our full memo here and check out some open roles below:
Forward Deployed Engineer - Remote (Poland, Bulgaria, United Kingdom, United States, Germany)
Full-Stack Growth Engineer - Remote (United Kingdom, United States, Poland)
LangChain is an open-source orchestration framework for the development of AI applications using LLMs. To learn more, read our full memo here and check out some open roles below:
Analytics Engineer - San Francisco, CA
Security Engineer and Compliance - San Francisco, CA
Check out some standout roles from this week.
Ramp | New York, Miami, San Francisco, Remote (US) - Senior Analytics Engineer, Senior Product Data Scientist (Financial Intelligence), Senior Software Engineer (Backend), Senior Product Designer
Clay | New York, NY - Machine Learning Engineer, Senior Site Reliability Engineer, Senior Software Engineer (AI), Go-To-Market Engineer
Pinecone | Remote (US) - Solutions Engineer, Director of Growth Marketing, Director of Product Marketing
Microsoft's AI business saw $13 billion in annual recurring revenue growing 175% year-over-year, while their Copilot AI usage saw a 10x increase in seats over the past 18 months and a 2x quarter-over-quarter increase in daily active users.
Despite having billions in the bank, OpenAI is in talks to raise $40 billion at a $340 billion valuation.
DeepSeek's cost-efficient AI models have sparked a "war of words" with OpenAI, which alleges that DeepSeek violated its terms of service by scraping outputs from OpenAI's proprietary models.
Apollo, a B2B sales intelligence and engagement platform, pivoted from a high-touch, sales-led model to a freemium, product-led growth (PLG) model, which allowed them to triple in size and surpass $100 million ARR in just 24 months.
OpenAI's collaboration with the U.S. National Labs will support their critical work in nuclear security, focused on reducing the risk of nuclear war and securing nuclear materials and weapons worldwide.
DeepSeek R1 is now available in the model catalog on Azure AI Foundry and GitHub, joining over 1,800 models, including frontier, open-source, industry-specific, and task-based AI models.
OpenAI claims DeepSeek used its proprietary models to train a rival AI system, violating OpenAI's terms of service.
Anthropic CEO Dario Amodei published an article where he argues export control policies on chips to China are even more existentially important now, despite the performance of DeepSeek's models, as they are crucial to preventing China from obtaining the millions of chips needed to develop powerful AI systems that could give them military and global dominance.
Waymo began testing its robotaxis on the freeways of Los Angeles.
OpenAI has launched ChatGPT Gov, a version of its ChatGPT AI assistant specifically designed for use by U.S. government agencies, with enhanced security features and the ability to handle sensitive government data.
Another OpenAI safety researcher, Steven Adler, has left the company, criticizing the "very risky gamble" of the global race toward AGI as labs rush to develop advanced AI without adequate safety measures.
Boom Supersonic's XB-1 demonstrator plane became the first independently developed civil jet to break the sound barrier, marking a significant milestone in the company's pursuit of reviving supersonic passenger travel.
Linda Yaccarino, X CEO, announced that X has partnered with Visa to launch the XMoney Account, which will allow secure and instant funding to the X Wallet via Visa Direct.
Helion, a fusion startup backed by Sam Altman, has raised $425 million in Series F funding to help build a fusion reactor for Microsoft that it plans to deliver by 2028.
The European Union is banning the Catholic prayer app Hallow.
DeepSeek caused a global rout in tech stocks, leading to over $1 trillion in losses for the "Magnificent 7" stocks heavily invested in AI, particularly Nvidia.
Jeffrey Emanuel, a former hedge fund analyst, published The Short Case for Nvidia Stock, where he argues that a combination of hardware and software threats could challenge Nvidia's ability to maintain its current high margins and market dominance in the AI computing space going forward.
China has set a record by sustaining plasma at 100 million degrees Celsius for nearly 18 minutes in its "artificial sun" fusion experiment.
DeepSeek has released Janus-Pro-7B, a new open-source multimodal AI model that outperforms both DALL-E 3 and Stable Diffusion in standard image generation benchmarks.
Aravind Srinivas, Perplexity’s CEO, announced the availability of the DeepSeek R1 reasoning model on the Perplexity platform to support daily deep web research, with increased limits and upcoming updates.
Character AI, a platform that lets users engage in roleplay with AI chatbots, has filed a motion to dismiss a case brought against it by the parent of a teen who committed suicide after allegedly becoming hooked on the company’s technology.
Mira Murati, the former CTO of OpenAI, has poached around 10 researchers and engineers from competitors like OpenAI, Character AI, and Google DeepMind to join her new AI startup, which is focused on AGI.
At Contrary Research, we’ve built the best starting place to understand private tech companies. We can't do it alone, nor would we want to. We focus on bringing together a variety of different perspectives.
That's why applications are open for our Research Fellowship. In the past, we've worked with software engineers, product managers, investors, and more. If you're interested in researching and writing about tech companies, apply here!