This year I didn’t write my usual daily blog posts during PASS Summit 2016, as it felt like a bit too much work with the long days and a bit of jet lag from the 10-hour time difference. Instead I decided to write one post summarizing my experience of the event. Every year, when deciding which pre-con sessions to take and which regular sessions to attend, I try to think of a theme. This year I went with Big Data and Analytics, as that’s an area of the Microsoft Data Platform I’m not terribly familiar with. It was also a good choice because SQL Server 2016 brings a huge number of improvements to the technologies involved, and there were quite a few sessions covering them.
From the Day 1 keynote
According to PASS president Adam Jorgensen, PASS is a growing global organization that now has 250,000 members in 170 countries! We also got to see some of the financial figures: an impressive 6.7M USD is spent on community training, which means that roughly 73% of PASS income returns to the community. Reflecting the changes around the Microsoft Data Platform, we also learned that PASS is re-branding itself, and a new and improved website will be up and running in early 2017.
The speakers from Microsoft were Joseph Sirosh, Corporate Vice President of the Data Group, and Rohan Kumar, Partner Director of Engineering. Joseph Sirosh introduced the term “ACID intelligence”, an abbreviation of Algorithms, Cloud, Internet of Things and Data. The goal at Microsoft is to bring this intelligence to every piece of software, such as Office 365. ACID intelligence has three identifiable patterns: Intelligent DB, Intelligent Lake and Deep Intelligence. Here is a quick summary of these patterns.
Rohan Kumar told us about putting intelligence inside the database rather than in the application layer. SQL Server 2016 offers integrated R language support, Machine Learning and Hybrid Transactional/Analytical Processing (HTAP) with In-Memory OLTP and columnstore for superb performance, throughput, security and parallelism. We also saw a demo of querying multiple different data sources from within SQL Server, using PolyBase to extend this intelligence outside SQL Server. According to Microsoft, their customers have seen 100x performance improvements in analytics and 30x improvements in OLTP performance with SQL Server 2016. Impressive numbers.
The Intelligent Lake allows processing of huge amounts of data (petabytes), for example in image and speech recognition. We also saw an example by Julie Koesmarno of sentiment analysis done on the novel War and Peace. It was rather interesting to see how easily you could tell not only which characters appeared in which books but also what their general mood was (as in happy or sad) in each book. When it comes to performance, we heard that Microsoft had done automated image tagging on 10 million images in just 10 minutes.
Azure offers extremely sophisticated deep learning capabilities on some interesting hardware. Basically these are GPU VMs that boast NVIDIA Tesla GPUs. These GPUs are designed especially for the high-performance computing market and have price tags to match; I think I saw some of these models on Amazon starting from 4,000 USD. Combined with other cutting-edge technologies, we now have a completely new level of deep learning capability available in a public cloud. We actually saw a pretty cool demo of this with eSmart Systems and their Connected Drone. Basically they use automated drones to monitor power lines, streaming video and still images to Azure, where they are analyzed in real time.
Other speakers and demos
There were also some additional speakers telling us how Microsoft solutions helped their businesses, and we saw some impressive performance numbers for Microsoft Big Data and Analytics solutions. A Finnish company was in the spotlight too, as Kalle Hiitola, Chief Technology Officer of Next Games, told us about their use of Azure DocumentDB to power their Walking Dead based mobile game. They are pushing 120 GB of new data to DocumentDB each day and performing 11,500 requests per second there. At the end of the keynote we also saw how you could use the R language to predict the appearance of Pokémon. While I’m not a fan myself, it made for an interesting demo. We also saw how an application running on a smartphone and on Pivothead smart glasses could use the Microsoft Cognitive Services APIs to help a blind person get a better understanding of his surroundings. The application could recognize objects, people, actions and moods from pictures taken with either the glasses or a phone and then give an audio description of them. Quite an amazing bit of technology that, some years back, was considered science fiction.
The Day 1 keynote is also available on PASS.tv, and I highly recommend watching it.
From the Day 2 keynote
David DeWitt was back! I was very disappointed last year when he announced that he was leaving Microsoft to move to Boston and work as a professor at MIT, and that it was likely his last keynote at PASS Summit. He gave a very good presentation about the reasons to build a data warehouse in the cloud, followed by an introduction to three different cloud data warehouses: Amazon Redshift, Snowflake and Azure SQL Data Warehouse. While I was somewhat familiar with Redshift and Azure SQL Data Warehouse, I wasn’t all that aware of the differences in how they were designed. Snowflake was completely new to me, but it has an especially interesting design. It’s designed to be highly elastic, with a compute layer consisting of several “virtual” warehouses, which are sets of nodes with local storage for data caching.
To summarize the pros for cloud Data Warehousing:
- Fast to set up
- Pay only for what you use
- Flexibility to scale up and down as needed
- Cost efficiency
The Day 2 keynote is also available on PASS.tv.
Every year we get a number of announcements from Microsoft regarding their plans and new services or software, and this year was no different. This year we learned that:
- There’s a public preview of Azure Analysis Services available (Analysis Services PaaS offering).
- Microsoft Cognitive Toolkit beta is now available.
- We can get an expanded free trial of Azure SQL Data Warehouse (free for one month).
- New tools called Data Migration Assistant and Database Experimentation Assistant have been released.
I already had an opportunity to run the Data Migration Assistant against a few databases we’re looking to move to SQL Server 2016. It’s a good addition to your toolkit when planning server migrations.
I managed to participate in two full-day pre-con sessions and 13 scheduled sessions over the week, which is a lot of information to take in over such a short period. I also managed to speak with a number of vendors and met quite a few new people, which is also very important at an event like PASS Summit. Over the week I also realized how much I dislike the term “cloud”, and there was a lot of talk about it. My main issue with the term is that most people are happy to put their data in a data center that is run by a service provider. As long as you call it a data center, it’s fine, but the moment you call it a cloud, people start freaking out.
And that’s because “cloud” sounds pretty damn vague, not like something that’s built in some of the most advanced data centers in the world. That’s something to think about when you consider whether cloud services are secure enough for your needs. I’m quite confident that there aren’t many service providers in the world that can match the security, performance and reliability designed into the data centers that host the “cloud”.