
SharePoint Fast Search Concepts and Terminology – Part 2/4

This blog post is Part 2/4 of a blog post series that will help you get familiar with a few concepts and terms used in any search technology. As this series focuses on ‘Fast Search for SharePoint’, you may see jargon specific to it.

Please check the other related posts: Part 1 | Part 2 | Part 3 | Part 4

The following graphic gives a ten-thousand-foot view of what I am trying to capture and explain in this blog post series.

Fast Search Terminology and Concepts

5. Recall and Precision

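Recall is the share of all relevant documents that a query actually returns, while Precision is the share of the returned results that are actually relevant. A large result set tends to raise Recall, but it also lets in noise and lowers Precision. You have to find a fair balance between the two: the result set should not be too large and should avoid noise as much as possible, yet it should not be so small that relevant documents are left out.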
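
To make the trade-off concrete, here is a minimal sketch in Python that uses the usual information-retrieval definitions (Precision = relevant results returned / all results returned, Recall = relevant results returned / all relevant documents). The document IDs and the function name are made up for illustration and have nothing to do with the Fast Search API.

```python
def precision_recall(returned, relevant):
    """Return (precision, recall) for a single query."""
    returned, relevant = set(returned), set(relevant)
    hits = returned & relevant  # relevant documents that were actually returned
    precision = len(hits) / len(returned) if returned else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical corpus: four documents are truly relevant to the query.
relevant = {"doc1", "doc2", "doc3", "doc4"}

# A narrow result set: everything returned is relevant, but half is missed.
narrow = ["doc1", "doc2"]
# A broad result set: nothing is missed, but half of it is noise.
broad = ["doc1", "doc2", "doc3", "doc4", "doc5", "doc6", "doc7", "doc8"]

narrow_scores = precision_recall(narrow, relevant)  # (1.0, 0.5)
broad_scores = precision_recall(broad, relevant)    # (0.5, 1.0)
```

The narrow query has perfect Precision but poor Recall, and the broad query is the other way around, which is exactly the balance you have to tune.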

There are various techniques to improve Recall, such as Synonyms and Stemming.

Synonyms:

This is a pretty common technique and a very obvious one. For instance, if you search for “happy”, the search engine would also query for “joy”, “elated”, “merry”, etc.

Stemming:

Stemming reduces a word to its root form. The search engine compares the root forms of the search terms with the root forms of the words in the documents in its content sources. For example, if the user enters “viewer” as the query, the search engine searches for “view” and returns documents containing view, views, viewer, viewing, viewed, etc.
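
As a rough illustration, the sketch below implements a deliberately naive suffix-stripping stemmer in Python. The suffix list, function names and sample documents are assumptions made for this example; real engines use far more careful, language-specific rules.

```python
# Hypothetical, far-from-complete suffix list (longer suffixes first).
SUFFIXES = ("ing", "ers", "er", "ed", "s")


def naive_stem(word):
    """Strip one known suffix, keeping at least three characters of root."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def matches(query, documents):
    """Return the documents that contain a word sharing the query's root."""
    root = naive_stem(query)
    return [
        doc for doc in documents
        if any(naive_stem(token) == root for token in doc.lower().split())
    ]


docs = [
    "Open the viewer",
    "Viewing large files",
    "Two views of the data",
    "Quarterly sales report",
]
found = matches("viewer", docs)  # the first three documents
```

Both the query and the document words are reduced to the same root (“view”) before they are compared, which is why one query term can match several word forms.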

Recall and Precision Balance


6.  Corpus

Corpus is the Latin term for body. In the search world, it refers to all the content, across all content sources, that the crawler crawls and indexes. The following gives an example of what a Corpus can include.

Corpus in Search


 


SharePoint Fast Search Concepts and Terminology – Part 1/4

This blog post is Part 1/4 of a blog post series that will help you get familiar with a few concepts and terms used in any search technology. As this series focuses on ‘Fast Search for SharePoint’, you may see jargon specific to it.

Please check the other related posts: Part 1 | Part 2 | Part 3 | Part 4

Before we dive deep into these concepts, let’s try to capture the overall process flow of any search engine. Typically, an organization’s content is stored in databases or as documents on a file system. These documents can be of any type: Word, Excel, images, videos, PDFs, PowerPoint files, etc. The primary job of any search technology is to crawl, index and surface results.

  1. Crawl:

    1. Once you have identified what content you want to crawl, you make the search engine aware of where it is located and grant the appropriate permissions to crawl it. A crawl basically collects data, primarily metadata, about the content.
  2. Index:

    1. Indexing is similar to how indexes work in books: they are just pointers to the actual location of the content. Typically, indexes are physical files that are added to the file system and are the output of a crawl.
  3. Search:

    1. Once crawling and indexing are complete, it’s time to search these indexes. Since the indexes are present on the file system, querying them is a lot faster than scanning the original content.
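
The three steps above can be sketched with a tiny in-memory inverted index in Python. This is only an illustration of the idea, not how Fast Search is implemented: the “crawler” just walks a dictionary, and the locations and contents are invented.

```python
from collections import defaultdict


def crawl(sources):
    """Yield (location, text) pairs; here the sources are an in-memory dict."""
    for location, text in sources.items():
        yield location, text


def build_index(crawled):
    """Map every word to the set of locations that contain it."""
    index = defaultdict(set)
    for location, text in crawled:
        for token in text.lower().split():
            index[token].add(location)
    return index


def search(index, query):
    """Return the locations that contain every term of the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    results = set(index.get(terms[0], set()))
    for term in terms[1:]:
        results &= index.get(term, set())
    return results


# Hypothetical content sources (location -> extracted text).
sources = {
    "//fileshare/hr/policy.docx": "travel expense policy",
    "//fileshare/it/vpn.pdf": "vpn setup guide",
    "//intranet/finance/expense.xlsx": "expense report template",
}

index = build_index(crawl(sources))
hits = search(index, "expense policy")  # only the HR policy document
```

Like the index of a book, the index stores only pointers (locations), so a query never has to open the original documents.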

The following graphic gives a ten-thousand-foot view of what I am trying to capture and explain in this blog post series.

Fast Search Terminology and Concepts


Let’s get started:

1. Content Processing:

This is the nexus of any search engine. It defines which data sources the search engine should crawl and the quality of the content that gets crawled. It is all about enriching the content even before it is crawled. Some of the tasks include:

  1. Making sure that the search engine doesn’t crawl a lot of noise, which would in turn affect Recall
  2. Detecting the language and applying language-specific rules
  3. Extracting metadata, and the list goes on.

2. Query Processing:

This kicks in after the user performs a search. The search engine analyzes what the user is actually requesting and accepts additional query parameters if needed. It also matches the query against the items in the search index and returns the search results to the user.

3. Relevancy:

This is the measure of how accurate/precise the search results are. Various factors determine how good the relevancy is; for instance, the more often users find the intended results at the top of the result list, the better the relevancy. Various techniques can improve relevancy, and they will be discussed in other parts of this blog post series.

4. Query Expansion: Best Bets, Synonyms, Lemmatization

Query expansion is a technique to improve Recall. When the user searches for a term, the search engine searches not only for that specific term but also for other relevant terms. This section explains the various techniques by which query expansion can be accomplished.

4.1 Lemmatization

Lemma is a Greek word meaning assumption; in linguistics it refers to the canonical form of a word. For instance, if the user searches for ‘ship’, the engine would also search for ‘shipped’, ‘ships’, etc. This is not to be confused with Stemming, which only substitutes or removes the ending of a word. For instance, Stemming would search for ‘See’, ‘Seen’ and ‘Seeing’ but not ‘Saw’, whereas Lemmatization would search for ‘Saw’ as well.
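
The difference can be sketched in a few lines of Python. The lookup table and the suffix rules below are toy assumptions for this example; real lemmatizers rely on large dictionaries and grammatical analysis.

```python
# Hypothetical lookup table mapping word forms to their canonical form.
LEMMAS = {
    "saw": "see", "seen": "see", "seeing": "see", "sees": "see",
    "shipped": "ship", "ships": "ship", "shipping": "ship",
}


def lemmatize(word):
    """Look the word up; unknown words are treated as their own lemma."""
    word = word.lower()
    return LEMMAS.get(word, word)


def strip_suffix(word):
    """Crude stemmer: remove one common ending, nothing more."""
    word = word.lower()
    for suffix in ("ing", "n", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


forms = ["see", "seen", "seeing", "saw"]
by_lemma = [w for w in forms if lemmatize(w) == "see"]    # all four forms
by_stem = [w for w in forms if strip_suffix(w) == "see"]  # misses "saw"
```

Because ‘saw’ does not share an ending-based root with ‘see’, only the dictionary-style lookup connects the two.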

4.2 Synonyms

This is one of the most popular Recall techniques, where the search engine returns results not only for the search terms but also for their synonyms. For instance, if the user searches for ‘Joy’, the results may include matches for ‘Merry’, ‘Happy’, ‘Elated’, ‘Celebration’, etc.
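
A minimal sketch of synonym-based query expansion in Python follows. The thesaurus and the function name are hypothetical; in a real deployment the synonym list is something an administrator maintains.

```python
# Hypothetical thesaurus: search term -> synonyms to add to the query.
SYNONYMS = {
    "joy": ["merry", "happy", "elated", "celebration"],
}


def expand_query(query):
    """Return the original terms plus their synonyms, without duplicates."""
    expanded = []
    for term in query.lower().split():
        for candidate in [term] + SYNONYMS.get(term, []):
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded


terms = expand_query("Joy")
# ['joy', 'merry', 'happy', 'elated', 'celebration']
```

The expanded term list is then what gets matched against the index, so documents that never mention ‘joy’ can still be returned.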

4.3 Best Bets/Visual Best Bets

Best Bets are usually links displayed at the top of the search results, pointing to specific pages or content. These links are manually curated by an administrator to be displayed for a particular search term. A Visual Best Bet is similar to a Best Bet, except that an image is provided along with the link and description.

 


Errors were encountered during the configuration of the Search Service Application.

I encountered the following exception after configuring a Domain Controller on my standalone SharePoint 2013 Azure VM.

The actual exception is “Windows NT user or group ‘SD\sdakoju’ not found. Check the name again. at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction) at System.Data.SqlClient.TdsParser.”

Root Cause:
Since I did not start off with a Domain Controller, all my SQL accounts were in “machinename\username” format, and the SQL Server logins were never updated to the new “domainname\username” format.

Following is the screen capture of the exception:

Windows NT user or group not found


Resolution:

I changed the SQL logins to the “domainname\username” format (I set my domain to ‘SD’). Please see the before and after screen captures below. Everything else is self-explanatory.

Errors were encountered during the configuration of the Search Service Application

Screen capture showing before and after changes to the SQL Logins

Hope this helps resolve your issue.


The SDDL string contains an invalid sid or a sid that cannot be translated

I encountered this exception while trying to create a ‘Search Service Application’ in SharePoint 2013.

Surprisingly, many people have encountered this particular exception in completely different scenarios.

For instance, some faced it while running through the SharePoint Configuration Wizard steps and some while creating Service Applications. Based on my understanding, it is commonly encountered on STANDALONE instances, primarily those set up for development. You might NOT face this issue on an enterprise-level SharePoint farm; you will see why by the end of this post.

Following are two common instances, along with their resolutions.

  1. Running the SharePoint Configuration Wizard[1].

    I set up my whole SharePoint farm via PowerShell scripts and did not encounter this myself, but the following solution from Microsoft TechNet worked perfectly for many. Hope this resolves your issue.

SharePoint 2013: The SDDL string contains an invalid sid or a sid that cannot be translated


 

2. Creating Service Applications

Following is a screen capture of the error message:

SharePoint 2013: The SDDL string contains an invalid sid or a sid that cannot be translated

Resolution: Set up your Domain Controller

When I spun up SharePoint 2013 on my Azure VM, I did not configure a Domain Controller, which appears to be a prerequisite for certain functionality in SharePoint 2013.

If you look closely at the exception in the above screen capture, you will find “Invalid sid or a sid that cannot be translated“. These issues are encountered when the Domain Controller is not configured correctly.

A security identifier (SID) is a unique value of variable length used to identify a trustee. Each account has a unique SID issued by an authority, such as a Windows domain controller, and stored in a security database. Each time a user logs on, the system retrieves the SID for that user from the database and places it in the access token for that user. The system uses the SID in the access token to identify the user in all subsequent interactions with Windows security. [2]

Finally, after exhaustive research, I resolved my issue by creating the domain controller. Please follow the instructions in ‘Windows Server 2012: Set Up your First Domain Controller (step-by-step)’ to set up your domain controller.

After this was complete, my Central Admin and all my web applications were working fine.

But when I went on to create my Search Service Application, I got hit by another minor error. Since I did not start off with a Domain Controller, all my accounts were in “MachineName\username” format and SQL Server still had the old usernames, so I changed them to “DomainName\username” and everything worked seamlessly.

Windows NT user or group not found


I was so glad to see this working. I worked on this issue stubbornly, sacrificing Super Bowl 2015. At least it paid off!

References:

[1] SharePoint 2013: The SDDL string contains an invalid sid or a sid that cannot be translated

[2] Security Identifiers


Error: Method not found: ‘System.Web.WebPages.IDisplayMode System.Web.Mvc.ControllerContext.get_DisplayMode()’.

I received the following error message while doing Sitecore MVC development. I am posting the resolution as it might help fellow developers facing the same issue.

Method not found: ‘System.Web.WebPages.IDisplayMode System.Web.Mvc.ControllerContext.get_DisplayMode()’.

Resolution: I disabled/deleted the following DLL, ‘Microsoft.Web.Mvc.FixedDisplayModes.dll’, and everything started working normally.


Building your first cloud hosted app on Office 365 – Using Napa

I was eavesdropping on one of those so-called ‘Technical Elevator Conversations’.

I heard “NAPA”.

I did some Googling, sorry, some Binging as well, and started to assimilate the information available on MSDN and other blogs. I fell in love instantly! What’s fascinating is that you can develop and deploy a complete ‘Cloud hosted App’ from scratch using just a browser and JavaScript. You would be surprised that the whole thing took less than ten minutes.

I wanted to keep it simple, so I started off by creating a temperature conversion tool that runs on plain JavaScript and a few lines of HTML. After all, the whole idea is to use Napa to create an app. So I quickly borrowed a few lines of code from W3Schools and used them in my app.

Note: There are many blogs and MSDN articles out there explaining Napa in great detail. I have tried to keep this as simple as possible, just to give you a glimpse of what Napa is all about and a first introduction to it.

Following is what I did:

  1. Create a new SharePoint 2013 site, or use an existing one, on the Office 365 portal.
    Note: If you do not have an Office 365 account, you can easily activate one with your MSDN subscription. If you do not have an MSDN subscription, you may sign up for Office 365 for home to start exploring some of the features.
  2. Once your SharePoint 2013 site is up and running, navigate to Site Contents and click ‘add an app’ as highlighted below.
    Snap1
  3. Go to the SharePoint Store by clicking the link highlighted below.
    Snap2
  4. Search for ‘Napa’ and you should find an app called ‘Napa Office 365 Development Tools’. Go ahead and install it.
    Snap3
  5. Once the install is finished, you should find the app in your ‘Site Contents’. You can start using ‘Napa’ either by clicking on it there or by clicking the ‘Build an app’ option available on the home screen. Please see below for both of these options.
    Option 1:
    Snap4
    Option 2:
    Snap5
  6. Kick off the app creation by clicking ‘Add New Project’.
    Snap6
  7. You will be prompted to choose the kind of app to create. Choose ‘App for SharePoint’ and give your app a name.
    Snap7
  8. Once you have completed the above step, you are officially on ‘Napa’ and can start coding. As I mentioned in the beginning, I borrowed the following code from W3Schools, which converts temperatures from Celsius to Fahrenheit and vice versa.
    Snap8
  9. That’s it, you are almost done. Click the Publish icon as highlighted in the screen capture. This should prepare the package, then deploy and launch the app.
    Snap9
  10. This is what the ‘Temperature Converter’ app looks like. Nothing fancy: two text boxes and a few lines of JavaScript.
    Snap10
  11. The app should now show up in your Site Contents, as highlighted below.
    Snap11
  12. Congratulate yourself on building your first cloud hosted app and get a brew.
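
For reference, the arithmetic behind such a converter is just two formulas. The sketch below shows them in Python; it is not the JavaScript used in the app (that code is only visible in the screen capture above), and the function names are made up.

```python
def celsius_to_fahrenheit(celsius):
    """F = C * 9/5 + 32"""
    return celsius * 9 / 5 + 32


def fahrenheit_to_celsius(fahrenheit):
    """C = (F - 32) * 5/9"""
    return (fahrenheit - 32) * 5 / 9


boiling_f = celsius_to_fahrenheit(100)   # 212.0
freezing_c = fahrenheit_to_celsius(32)   # 0.0
```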

Set up the development environment for SharePoint 2013 on Azure

Setting up a development environment for SharePoint is easy if you have the right hardware and software and a basic understanding of the configuration.

For SharePoint 2007 it was simple: the hardware requirements were basic, and I managed to get it working on a laptop with a modest configuration. SharePoint 2013 now supports ‘App’ development and other new features that require a lot of additional RAM and a lot more hardware. So I decided NOT to upgrade my hardware and instead make use of the Azure privileges that come with my MSDN subscription.

This blog post will guide you through setting up your developer instance on Azure. Unfortunately, it does not cover configuring your developer VM step by step. If you are interested in building your VM from scratch, please follow the SharePoint 2013 Virtual Machine Set up Guide (Version 3.0) from Critical Path Training.

Note: This post assumes that you already have an active subscription with ‘Microsoft Azure’. If you don’t have one, you can sign up for a trial account, or if you have an active ‘MSDN Subscription’ you can enjoy the recurring $150 credit per month that ships with your subscription. This is a great way to start, and I am using my subscription for all my Azure development. You may visit the Microsoft Azure Free Trial page.

Note: All the screen captures of the Azure portal were valid at the time of writing this blog post, i.e. January 2015. Microsoft is very aggressive about not only adding new features to the portal but also enhancing its user experience.

Step 1: Navigate to Windows Azure Management Portal

Step 2: Click the ‘New’ icon at the bottom of the screen.

Step 3: You should see the following screen, with an option to add a Virtual Machine from the Gallery.

Select VM from Azure Gallery


Step 4: Choose the ‘Image’ of your choice. Following is what I chose in the portal.

Snap12

Step 5: Perform the following actions.

  1. Provide an appropriate Virtual Machine name.
  2. Select ‘Basic’ as the ‘TIER’.
    1. The Basic tier provides an economical option for dev/test workloads and other applications that don’t require load-balancing, auto-scaling, or memory-intensive virtual machines. The Standard tier is the recommended option for all production workloads.
  3. Choose A4 as the ‘SIZE’.
    1. Make sure you choose it from the Basic tier.
      Snap15
    2. The VM configuration I chose is highlighted below.
      Azure Virtual Machine Pricing
  4. Provide a new Username and Password.

Step 6: Choose appropriate ‘REGION/AFFINITY GROUP’ 

Snap17

Step 7:  Finish the configuration

Snap18

 Step 8: Make sure the VM is up and running

Azure VM is running

Step 9: Download RDP file and connect to the VM
Once the VM status is ‘Running’, you can ‘CONNECT’ using the option highlighted below. Clicking ‘CONNECT’ will download the ‘RDP’ file; double-click it, follow the prompts and log on to the VM. Use the same ‘Username’ and ‘Password’ you provided while creating the VM on the Azure portal.

Connect to Azure VM

Step 10: Run the PowerShell Scripts

The VM ships with a few PowerShell scripts that you need to run to install and configure the SharePoint environment, SQL Server and all the other required software. Trust me! It can’t get simpler than this: running one script spun up the whole SharePoint environment! Neat!

On the desktop you will find the shortcut highlighted below, which leads to the script ‘ConfigureSharePointFarm.ps1’. Run it!

Snap25

Once you run the PowerShell script, you will be asked for the ‘localSPFarmAccountName’ and ‘localSPFarmAccountPassword’. Enter these and you are all set!

Snap29

Step 11: Finally, you are ready. Remember to smile 🙂

Search for ‘Central’ in the installed apps on the server and you should see the gorgeous Central Admin icon. Pin it to the desktop. You are all set, Happy Programming!

Snap30

Note: Remember to turn the VM off when you are not using it. You will be charged for every minute your VM is available and running.


Introduction to Sitecore – A tour of what’s in the box

Sitecore is THE BEST Web Content Management System (WCM) available on the market. If you are in the midst of an RFP (Request for Proposal) for a WCM, I bet you have Sitecore as one of the finalists. I am not surprised! You may want to check Gartner’s Magic Quadrant for WCM 2014: Sitecore is positioned furthest in Completeness of Vision in the Gartner Magic Quadrant for Web Content Management[1].

This blog post is for all those who want a glimpse of what Sitecore is all about. I have worked with versions 6.5 and 7.0 of Sitecore. All the following information is relevant to Sitecore 7.0. Please note that Sitecore has made major enhancements in version 7.5 and in the current version 8.0 (beta at the time of writing this post). Describing these enhancements is out of the scope of this blog post.

If you are starting a new project with Sitecore, I would highly recommend going with version 7.5 or above. Sitecore has invested a lot of time and effort in building a highly scalable, high-performance web content management system. And upgrading from version 7.5 to 8.0 is a lot easier than upgrading from 7.0 or earlier to 8.0.

Let’s get started. Following is the overall view of what Sitecore 7.0 offers as products.

Sitecore Overview – Products offered


Sitecore is not just about publishing content to your websites. It does a lot more and comes packaged with a lot of great products and intelligence. Sitecore is built on the .NET framework, and I am sure Microsoft could not be prouder of having the best WCM built on their framework.

Following is a high-level summary of these products. I will try to add more posts explaining them in detail.

  • WCM (Web Content Management): This is the core functionality, which handles the publishing of content, and it is the best among the products offered.
  • CEP (Customer Engagement Platform): This platform is dedicated to providing an elegant, integrated solution that connects channels, engagement automation and engagement analytics with external tools and databases.
  • DMS (Digital Marketing Suite): The best suite available on the market for advanced online marketing capabilities like campaign management, web analytics, visitor profiling, etc.
  • Solution Accelerators: These are pre-built extensions for Sitecore, like E-Commerce, Intranet platform, etc., that accelerate your implementation.
  • AppCenter: This is a portal where you can sign up for various apps such as email delivery, spam detection, etc.

References:

[1] 2014 Gartner Magic Quadrant for Web Content Management


Cloud Applications development – It’s NOT all about VMs

Cloud platforms offer a plethora of options and come in many flavors. Be it Microsoft Azure, Rackspace, AWS, Oracle or any other cloud platform of your choice, each of them is unique and irreplaceable. But they all share common offerings, namely Infrastructure as a Service (IaaS) and Platform as a Service (PaaS). IaaS is the most basic and most widely adopted cloud offering. PaaS is gaining popularity and may become the default choice in the near future, but that is not the case at the time of writing this blog.

So if you are a cloud applications developer, it is key that you understand what IaaS and PaaS are and what to bear in mind while developing applications for cloud platforms.

If you assumed that Cloud or Cloud Computing is nothing but a collection of millions and millions of VMs/servers remotely hosted and maintained by someone else, and that deployment is all about copying and pasting your code onto those servers, I am afraid you are wrong!

Developing cloud applications is not mere copy and paste; there is much more to it.

This post covers a few examples that should influence your thought process while developing applications for the cloud. Again, there are many more things to consider while developing apps for Cloud platforms.

The following is an honest attempt to persuade you to adopt a different perspective when you design Cloud apps.

  1. Automation of Deployments and Hardware Provisioning
    1. There is a high chance that you have not been concerned about hardware provisioning in traditional development. But with cloud application development, it is key that you know how to automate all your hardware provisioning and deployments. You could leverage the Windows Azure Object Model, which helps automate a lot of things that are typical of Cloud application development, like the creation of VMs, Websites, Storage Accounts, etc.
    2. You may leverage C#, PowerShell scripts or popular tools like Chef and Puppet for IT/deployment automation.
    3. Automation is one of the key design principles of Cloud application development and is highly recommended by Microsoft. It not only minimizes errors but also makes it easier to create and deploy items repeatedly.
    4. Example: Consider a scenario where you are asked to create developer VMs for 20 developers with Windows Server 2012 R2, SQL Server R2, IIS 8.0, SharePoint 2010, MS Office and other supporting tools.
      1. What is the first thought you get? I am sure you are tempted to create 20 VMs on the Azure portal, RDP onto these machines and install the software. I bet that even if you spend just 2 to 3 hours on each VM, it would take 40 to 60 hours in total, i.e. one to one and a half business weeks.
      2. You might also have thought: I would create one instance, generalize it and create the rest from it, correct? Note that some software cannot be part of a generalized VM and therefore cannot be part of the template.
      3. Solution: Use scripts or any kind of automation of your choice.
        Write a script to create and spin up these 20 VMs. This can easily be done using PowerShell scripts, often with a few lines of code. It would hardly take an hour or two.
        For the other issue, software that doesn’t support generalization or bundling with VM images/templates, you could read ‘Bootstrapping a Virtual Machine with Windows Azure’ by Michael Washam.
  2. Budgeting and Cost Consciousness
    1. As a traditional developer, cost is not something you would consider while developing. But when you are designing Cloud apps, everything comes with a price. For instance, your design choice of storing data in memory vs. in a database can have a huge impact on pricing.
    2. Depending on your application design choices, you may end up saving a ton or paying the price for not treating cost as an important parameter.
    3. Cost calculation is as important as application design, development and testing. It should be an integral part of design.
  3. Fluid/Adaptive design choices
    1. The beauty of cloud platforms is that you can start with a minimal configuration and grow as needed. So your design should be flexible enough to adapt to changing hardware: it could be reduced CPU utilization or memory allocations, or additional servers added to the NLB; it could be anything.
    2. Your design choices are critical for Cloud development. For instance, for session management, you should prefer OutProc session management to InProc when using the ‘Swap Deployment‘ feature in Azure. Swapping means that the virtual IPs are swapped between the staging and production environments of a service. If the service is currently running in the staging environment, it will be swapped to production. If it is running in the production environment, it will be swapped to staging. So if you are leveraging InProc session management, you would lose your state; it is ideal that you choose OutProc session management. Please see the screen capture below highlighting the ‘Swap’ feature under instances.
      Swap Deployment with Microsoft Azure


       

  4. High availability, Performance and Fault Tolerance
    1. Most Cloud providers have an availability SLA of 99.5% or above. This is valid only if you provision at least two instances for every service. Developers should be aware of this.
      1. If you choose only one instance, your application may be temporarily unavailable during patching or reboots.
    2. Another important feature that every developer should understand is Affinity Groups. These allow you to group your Azure services to optimize performance. All services and VMs within an affinity group will be located in the same region. For instance, it is ideal to associate your application and its supporting database servers with the same Affinity Group to avoid network latency and increase performance.
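
To get a feel for what an availability SLA means in practice, the small Python sketch below converts an SLA percentage into the downtime it still allows. The 30-day month and the function name are assumptions made for this example; check your provider’s SLA for the exact terms.

```python
def allowed_downtime_hours(sla_percent, days=30):
    """Hours of downtime per period that still satisfy the given SLA."""
    return (100 - sla_percent) / 100 * days * 24


# A 99.5% SLA over a 30-day month still allows roughly 3.6 hours of downtime.
monthly_downtime = allowed_downtime_hours(99.5)
```

Even a seemingly high percentage leaves room for hours of unavailability, which is one more reason to provision at least two instances.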

Conclusion: It is key to understand that developing Cloud-based applications is not the same as traditional development. You should start by leaving your ego aside and stop pretending that developing for Cloud platforms is no different from traditional development. And stop thinking that developing for Cloud platforms is all about logging on to a VM, then copy-paste, deploy and configure. As a matter of fact, I had the same impression and attitude when I started picking up Azure. It’s time to digest the notion that Cloud application development is quite different in terms of design, development and cost AND IS NOT ALL ABOUT VMs.