Simple and Distributed Machine Learning
We would like to acknowledge the developers and contributors, both internal and external, who helped create this version of SynapseML.
This list of changes was auto generated.
| Distributed Langchain | Vector Search Indices | Semantic Link |
| --- | --- | --- |
| Deploy your LLM apps on millions of documents | Quickly create semantic and multi-modal search engines | Work with PowerBI datasets natively from Microsoft Fabric |
| View Notebook | Try an Example | Learn More |
| Keyless AI Services | Orthogonal Forests |
| --- | --- |
| Use built-in AI services without keys in Microsoft Fabric | Discover and measure heterogeneous causal effects |
| Learn More | Try an Example |
- LangchainTransformer for orchestrating LLMs at scale (#1925, #2036)
- OpenAIChatCompletion transformer (#1887)
- DistributionBalanceMeasures to detect data drift (#1885)
- getPValue (#1863)
- .setFitIntercept should default to true (#1876)
- aadToken & url on Fabric (#1918)
- [-1] (#1906)
- getPValue to python API of DoubleML (#1909)
- Foo type from Python codegen (#1867)
- cognitive.* APIs to services.* (#2117)

We are excited to highlight the contributions of the following SynapseML contributors:
| Aydan Aksoylar | Sheryl Zhao | Markus Cozowicz |
| --- | --- | --- |
| Aydan is a Senior Applied AI Engineer and a first-time contributor to SynapseML. Aydan recently joined Azure Data but quickly led the efforts to add the new integration with Azure Cognitive Search's Vector Indices. This feature allows users to quickly create flexible semantic search engines powered by rich models like GPT-4. Aydan went above and beyond on this project and also contributed a Document Question and Answering with PDFs quickstart to showcase how to use these new features. | Sheryl is a Principal Applied Scientist on the SynapseML team and a first-time contributor to SynapseML. Sheryl worked hard to devise an elegant connection between LangChain and SynapseML to enable deploying chains on large datasets. She also designed and built a lovely quickstart to showcase how to build a distributed arXiv reader with only a few lines of code. | Markus is a Principal Applied Scientist on the SynapseML team and a SynapseML veteran developer. Markus has contributed algorithms running the gamut from reinforcement learning and LLMs to anomaly detectors. This release, Markus contributed an ambitious and full-featured integration between SparkSQL and PowerBI data models. This allows users to explore their existing PowerBI datasets and measures with the full generality of PySpark or (Scala) Spark. This dramatically expands the automation possibilities within Microsoft Fabric. Markus never ceases to out-do his prior contributions and we are excited to see what he has in store next. |
| Amir Jafari | Aadharsh Kannan | Brendan Walsh |
| --- | --- | --- |
| Amir Jafari is a Senior Product Manager on the SynapseML team and has recently taken over the role of the official SynapseML PM. Amir's passion to advance the library was instrumental in driving us to v1.0. He is fiercely productive and has a knack for simplifying and improving the SynapseML user experience. Additionally, Amir isn’t afraid to roll up his sleeves and contribute notebooks and blogs. He drove several efforts to create new quickstarts and documentation for a variety of SynapseML features. | Aadharsh is a Vice President and Head of Economics and Data Science at Western Digital. Aadharsh is also a new SynapseML contributor whose first contribution significantly generalized our causal inference stack to support fast estimation of heterogeneous causal treatment effects with Orthogonal Random Forests. This was a nontrivial and mathematically intensive contribution, and we are grateful for Aadharsh's expertise and persistence in getting this through our build system. | Brendan is a Senior Engineer on the SynapseML team and a talented developer. Brendan's contributions range from core improvements to the SynapseML build and documentation generation system, to spearheading customer engagements and onboarding AI services. Most recently, Brendan used SynapseML to create and donate thousands of audiobooks to the open source community in partnership with Project Gutenberg. This effort was considered one of TIME's top 200 inventions of 2023. You can learn more about Brendan’s awesome technical philanthropy efforts at https://aka.ms/audiobook. |
| Jessica Wang | Serena Ruan | Cruise Li |
| --- | --- | --- |
| Jessica is a Software Engineer who recently joined the SynapseML team. Already, Jessica has grown into the role of the SynapseML benevolent “doc”tator. This release, Jessica has worked hard to ensure that the SynapseML notebooks work across a wide variety of Spark platforms and are easy and simple to get started with. This work requires knowledge of the entire library’s surface area, and we are thankful Jessica has worked so hard to learn this breadth of content. Furthermore, Jessica was also instrumental in building our Azure Doc auto-generation system to ensure all docs are tested as part of our CI build. | Serena is a Software Engineer at Databricks, an MLflow maintainer, and a prolific SynapseML contributor. Serena's impact can be felt throughout almost every aspect of the library, and she is personally responsible for the new Form Recognizer V3 update, new streaming anomaly detection APIs, distributed deep network training, and many more features. Additionally, Serena laid the foundations of keyless authentication on Fabric, and pioneered our integration with MLflow. | Cruise is a Software Engineer II on the SynapseML team in Beijing. Cruise has been instrumental in building and testing the keyless Azure AI services on Microsoft Fabric. With this contribution, Fabric users can configure their workspaces to use OpenAI, Langchain, and a variety of other AI services without the hassle of managing keys or authentication. Cruise has also worked hard to ensure AAD authentication works with Azure AI services and has helped the effort to standardize logging and telemetry across SynapseML and its sister projects. |
We would like to acknowledge the developers and contributors, both internal and external, who helped create this version of SynapseML:
Markus Weimer @markusweimer, Eric Dettinger @sandshadow, Scott Votaw @svotaw, Mark Niehaus @niehaus59, Aydan Aksoylar @aydan-at-microsoft, Sheryl Zhao @sherylZhaoCode, Markus Cozowicz @eisber, Brendan Walsh @BrendanWalsh, Jessica Wang @JessicaXYWang, Tom Finley @TomFinley, Sailesh Baidya @saileshbaidya, Keerthi Yanda @KeerthiYandaOS, Kyle Rush @k-rush, Aadharsh Kannan @AKannanMSFT, Serena Ruan @serena-ruan, Cruise Li @mslhrotk @lhrotk, Jason Wang @memoryz, Haizhou (Dylan) Wang @dylanw-oss, Sarah Shy @sarahshy, Kashyap Patel @ms-kashyap, Puneet Pruthi @ppruthi, Ilya Matiach @imatiach-msft, Amir Jafari @amhjf, Nellie Gustafsson, Bogdan Crivat, Justyna Lucznik @juluczni, Richard Wydrowski @richwyd, Tania Arya @taniaarya, Adithya Mukund @adithyamukund, Roman Batoukov @RomanBat, Alexandra Savelieva @alsavelv, Jessica Wolk @msplants, Luis França @luisffranca, Paul Koch @paulbkoch, Rich Caruana, Avrilia Floratou, Martha Laguna @martthalch @marthalc, Jeff Zheng, Sciong Yang, Peixian Gong, Ruixin Xu, Chris Hoder, Derek Legenzoff, Misha Desai, Eren Orbey, Beverly Kodhek, Louise Han @jr-MS, Raj Rikhy, Brice Chung, Marcos Campos, Mike Estee, Kim Manis, Mitrabhanu Mohanty, Anand Raman, Sudarshan Raghunathan @drdarshan, William T. Freeman, John Moyer, Vidip Acharya, Ashit Gosalia, Miguel Fierro @miguelgfierro, Ismaël Mejía @iemejia, Kartavya Neema @kartavyaneema, Daniel Ciborowski @dciborow, Mark Tabladillo @marktab, Guilherme Beltramini @gcbeltramini, Akshaya Annavajhala (AK), James Verbus @jverbus, Mopé Akande @msakande, Frank Solomon @fbsolo-ms1, ONNX Team, Azure Global, Vowpal Wabbit Team, LightGBM Team, MSFT Garage Team, MSR Outreach Team, Speech SDK Team, MLflow Team, Azure Docs Team