Befriending Dragons

Turn Scary Into Attainable


2 Comments

HDInsight Big Data Talks from #SQLPASS

SQL PASS Summit 2013 was another great data geek week! I chatted with many of you about Big Data, Hadoop, HDInsight, architecting solutions, SQL Server, data, BI, analytics, and general geekiness – great fun! This time around I delivered two talks on Hadoop and HDInsight – the slides from both are attached.

Zero to 60 with HDInsight takes you from an overview of Big Data and why it matters (zero) all the way through an end to end solution (60). We discussed how to create an HDInsight cluster with the Azure portal or PowerShell and talked through the architecture of the data and analysis behind the release of Halo 4. We talked about how you could use the same architectural pattern for many projects and walked through Hive and Pig script examples. We finished up with how to use Power Map (codename GeoFlow) over that data to gain new insights and improve the game experience for the end user.

The next session I co-presented with HDInsight PM Dipti Sangani: CAT: From Question to Insight with HDInsight and BI. We went deeper this time. Not only did we present an end to end story with how our own internal Windows Azure SQL Database team uses telemetry to improve your experience with SQL Server in Azure PaaS but we also went deeper with demos of Hive, Pig, and Oozie. We also gave another archetypical design scenario that will apply to many of your own scenarios and talked about how HDInsight fits with SQL Server and your other existing infrastructure. The deck covers your cloud and on-premises options for Hadoop on Windows including HDInsight Service, Hortonworks HDP for Windows, OneBox, and PDW with Polybase.

Please let me know if you have any questions from the talks or just general HDInsight questions!

PASSSummit2013BigData.zip


2 Comments

Self-Service BI Works!

When I talk to people about adding self-service BI to their company’s environment I generally get a list of reasons why it won’t work. Some things I commonly hear:

  • I can’t get anyone in IT or on the business side to even try it.
  • The business side doesn’t know how to use the technology.
  • This threatens my job.
  • I just don’t know where to start either politically/culturally or with the technology.
  • I have too many other things to do.
  • How can it possibly be secure, allow standardization, or result in quality data and decisions?
  • That’s not the way we do things.
  • I don’t really know what self-service BI means.

#PASSBAC 2013 Cindy and Eduardo 

So what is a forward thinking BI implementer to do? Well, Intel just went out and did it, blowing through the supposed obstacles. Eduardo Gamez of Intel’s Technology Manufacturing Engineering (TME) group interviewed business folks to find those who were motivated for change, found a great pilot project with committed employees, and drove the process forward. They put a “sandbox” environment up for the business to use and came up with a plan for monitoring the sandbox activity to find models and reports worth adding to their priority queue for enterprise BI projects. The business creates their own data models and their own reports for both high and low priority items. IT provides the infrastructure and training including products like Analysis Services, PowerPivot, Power View, SharePoint, Excel, SQL Server, and various data sources. The self-service models and reports are useful to the business – they reduce manual efforts, give them the reports they want much faster, and ultimately drive better, more agile business decisions. If a model isn’t quite right after the first try, they can quickly modify it. The same models and reports are useful to IT – they are very refined and complete requirements docs that shorten the time to higher quality enterprise models and reports, they free up IT resources to build a more robust infrastructure and allow IT to concentrate on projects that require specialized IT knowledge. Everyone wins with a shorter time to decision, higher quality decisions, and a significant impact on the bottom line.

Learn more about how Intel TME is implementing self-service BI:

Eduardo (eduardo.m.gamez@intel.com) and I (cgross@microsoft.com or @SQLCindy) are happy to talk to you about Self-Service BI – let us know what you need to know!

Digg This

How_Intel__Integrates_Self-Service_BI_with_IT_for_Better_Business_Results_[DAV-208-M].zip


1 Comment

Big Data – All Abuzz About Hive at #SQLPASS Summit 2012

Big Data – All Abuzz About Hive
Small Bites of Big Data

Cindy Gross, SQLCAT PM

I hope to see you at the #SQLPASS Summit 2012 this week!

There are many reasons people come to the PASS Summit – SQL friends, SQL family, networking, great content in 190 sessions, the SQL clinic, the product team, CSS, SQL CAT, MVPs, MCMs, SQL Community, and sometimes just to get away from the daily grind of work. All those are great reasons, and they all make for a great Summit.

This year I am focusing on BI and Big Data. For the Summit this year I will be introducing you to Hive. Hive is a great way to leverage your existing SQL skills and enter into the world of #BigData. Big Data is here in a big way. CIOs are pushing it, business analysts want to use it, and everyone wants to gain new insights that will help their business grow and thrive. Don’t let Big Data pass you by, leaving you wondering how you missed out. Hive is fun and interesting, and for SQL Pros it looks very familiar. Come to my talk and come by the SQL Clinic to ask questions throughout the week. Please come up and introduce yourself at any time, I love to meet new SQL Peeps!

BIA-305-A SQLCAT: Big Data – All Abuzz About Hive
Wednesday 1015am   |   Cindy Gross, Dipti Sangani, Ed Katibah

Got a bee in your bonnet about simplifying access to Hadoop data? Want to cross-pollinate your existing SQL skills into the world of Big Data? Join this session to see how to become the Queen Bee of your Hadoop world with Hive and gain Business Intelligence insights with HiveQL filters and joins of HDFS datasets. We’ll navigate through the honeycomb to see how HiveQL generates MapReduce code and outputs files to answer your questions about your Big Data.

After this session, you’ll be able to democratize access to Big Data using familiar tools such as Excel and a SQL-like language without having to write MapReduce jobs. You’ll also understand Hive basics, uses, strengths, and limitations and be able to determine if/when to use Hive in combination with Hadoop.

And there’s more! Here is a sampling of blog posts about this year’s summit:

After the Summit, you can still stay involved. Follow some SQL Peeps on Twitter, sign up for a few SQL blog feeds, and buy a book or two. Attend local events like SQL Saturdays and User Group meetings. Reach out to your fellow SQL-ites and stay in touch with those you meet. Keep SQL fun and interesting and share what you learn!

See you at the #SQLPASS Summit 2012!

I hope you’ve enjoyed this small bite of big data! Look for more blog posts soon on the samples and other activities.

Note: the CTP and TAP programs are available for a limited time. Details of the usage and the availability of the CTP may change rapidly.

PASS2012BIA305AAllAbuzzAboutHive.pptx


Leave a comment

What’s all the Buzz about Hadoop and Hive?

What’s all the Buzz about Hadoop and Hive?

Why it Matters for SQL Server Peeps

Small Bites of Big Data

Cindy Gross, SQLCAT PM

On September 20, 2012 we have another 24 Hours of PASS event! This PASS Summit Preview will give you a taste of what is coming at this year’s PASS Summit. There are 190+ technical sessions this year at the Summit, and you’ll get a preview of 24 of them at the #24HOP event tomorrow! Come hear about some of the hottest topics and features in the SQL Server, BI, and data world.

One of the big buzzwords over the last year or so is Hadoop, and the most familiar part of Big Data and Hadoop to most SQL Server professionals is Hive. Do you wonder what it is and why you should jump in now while it’s still new and growing by leaps and bounds? I have just the #24HOP session for you!

#24HOP: What’s all the Buzz about Hadoop and Hive? – Why it Matters for SQL Server Peeps

Everyone is buzzing about Hive and trumpeting the virtues of Hadoop. But what does it mean? Why does it matter to a SQL Server and/or BI professional? Come get a taste of the Hive honey and see why this new technology is worth buzzing about!

During this talk I’ll give a very high level overview of Big Data, Hadoop, and Hive (for the nitty gritty details come to the Summit!). I’ll also go through why Hive matters in the SQL Server world, what a SQL Server Peep might end up doing in a Hive world, and why it is important for you as a SQL Server Peep to jump in and get your feet wet with Hive now.

Once you’ve heard this #24HOP talk I hope you’ll be fired up about Hive and more anxious than ever to sign up for the  PASS Summit to learn even more about Hadoop, Hive, Big Data, and all things BI and SQL Server. I’ll be co-presenting at the Summit with SQL Server PM Dipti Sangani:

SQLCAT: Big Data – All Abuzz About Hive [BIA-305-A]
Session Category: Regular Session (75 minutes)
Session Track: BI Platform Architecture, Development & Administration
Speaker(s): Cindy Gross, Dipti Sangani

Got a bee in your bonnet about simplifying access to Hadoop data? Want to cross-pollinate your existing SQL skills into the world of Big Data? Join this session to see how to become the Queen Bee of your Hadoop world with Hive and gain Business Intelligence insights with HiveQL filters and joins of HDFS datasets. We’ll navigate through the honeycomb to see how HiveQL generates MapReduce code and outputs files to answer your questions about your Big Data.

After this session, you’ll be able to democratize access to Big Data using familiar tools such as Excel and a SQL-like language without having to write MapReduce jobs. You’ll also understand Hive basics, uses, strengths, and limitations and be able to determine if/when to use Hive in combination with Hadoop.

I hope you’ve enjoyed this small bite of big data! Look for more blog posts soon on the samples and other activities.

Note: the CTP and TAP programs are available for a limited time. Details of the usage and the availability of the CTP may change rapidly.

UPDATE 9/28/12 – demo steps to load the AdventureWorks data to Hive are available at http://blogs.msdn.com/b/cindygross/archive/2012/05/07/load-data-from-the-azure-datamarket-to-hadoop-on-azure-small-bites-of-big-data.aspx.

24HOPFall2012HiveBuzz.zip


2 Comments

Witty WIT Speakers needed for 24 Hours of PASS – March 2011

SQL PASS is planning another 24 Hours of PASS (24HOP). It will be March 15-16, 2011. Since that is during Women’s History month all the speakers will be women. So if you’re a Woman in Tech (WIT) and know something about SQL or how Windows or storage affects SQL this is a great opportunity!

 

Email your abstract to 24hours@sqlpass.org by Jan 14, 2011. No more than 250 words for the abstract itself plus 125 words for your bio.

 

So what is 24HOP? The format has changed a bit over time, but currently it consists of 12 hours of back to back one hour sessions about SQL Server held two days in a row. Each session is delivered by a different speaker and the topics vary quite a bit but all relate to understanding SQL Server. This is all done over Live Meeting so you can attend from your desk, and the sessions are made available for viewing later if you can’t attend them all live. Ideally the sessions will have time at the end for questions, so you don’t need a full hour of slides and/or demos.

 

So why should you submit an abstract? Here are a few good reasons: Community involvement, including speaking, is one of the things that can help qualify you for nomination for SQL MVP. A speaking engagement is a great item to put on your review or resume. And by speaking at smaller events like 24HOP it becomes easier to get your abstracts selected at bigger, more competitive events such as SQL PASS.

 

I look forward to seeing all of you speaking at 24HOP, your local user groups, and SQL PASS!

 

More SQL Women and WIT discussions:

·         Why So Few? http://www.aauw.org/learn/research/whysofew.cfm

·         Data Chicks, We Need You! Call for Speakers http://blog.infoadvisors.com/index.php/2010/12/09/data-chicks-we-need-you-call-for-speakers-24hop/

·         Trolling the #24HoP http://blog.infoadvisors.com/index.php/2010/12/10/trolling-the-24hop

·         24HOP Needs Women http://sqlserverpedia.com/blog/sql-server-bloggers/24hop-needs-women/

·         I Need More Women http://thomaslarock.com/2010/12/i-need-more-women

·         Women in Technology – Why Does it Matter? http://blogs.msdn.com/b/cindygross/archive/2010/12/09/women-in-technology-why-does-it-matter.aspx

·         The Next 24 Hours of PASS Event: Announcement and Call for Presenters  http://littlekendra.com/2010/12/02/the-next-24-hours-of-pass-event-announcement-and-call-for-presenters/

·         The Next 24 Hours of PASS http://www.jasonstrate.com/2010/12/the-next-24-hours-of-pass/