LessThanDot

Data Management

    With high availability, the less time data is unavailable, the higher the uptime percentage achieved.  Uptime is the period of time in which equipment is available for use.  Even a 5 minute failover can put an uptime goal of 99.999% availability at risk, since five nines allows only about 5.26 minutes of downtime in a year.  Each point of failure that can cause downtime should be considered in the overall equation for uptime.  Since each component must be functioning properly to deliver the requested data to the requestor, uptime can be calculated as an achievement of the entire team, including server, network, and data.  This is done by calculating the mean uptime, or mean availability.
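    To put availability targets in perspective, the following is a minimal sketch of how a yearly downtime budget can be derived from a target percentage.  The function name and the list of targets are illustrative assumptions; the 365.25-day year matches the one used for planned uptime later in this article.

```python
# Minutes in an average year, assuming a 365.25-day year.
MINUTES_PER_YEAR = 365.25 * 24 * 60  # 525,960


def allowed_downtime_minutes(target_percent, period_minutes=MINUTES_PER_YEAR):
    """Return the downtime budget, in minutes, for an availability target."""
    return period_minutes * (1 - target_percent / 100)


# Illustrative targets: two nines through five nines.
for target in (99.0, 99.9, 99.99, 99.999):
    budget = allowed_downtime_minutes(target)
    print(f"{target}% -> {budget:.2f} minutes of downtime per year")
```

    At five nines the budget comes to roughly 5.26 minutes per year, which is why a single 5 minute failover nearly consumes it.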

    Mean Uptime Equation

    The mean uptime equation takes into account any object that can prevent requested data from reaching the originating requester.  The object can be a computer, appliance, database server, other server, network or storage.  Considering each one of these pieces is critical in planning database availability.  For example, if a database server has maintained the five nines but the appliance that provides the VPN connectivity needed to reach that database server lost 10% of its uptime, the uptime users actually experienced should be measured as 90%.  That does not mean that, as a database administrator, you should be harsh on your own goals for uptime.  The database server may still have reached 99.999% uptime, and if the five nines were set as its achievement for the year, that goal was met.  However, we gauge ourselves on the user’s experience, and that requires an equation that covers the total infrastructure from point A to point B.  Point A is the user and point B is the data they request.

    So a better goal than data availability alone (point B) has to take into account the major factors that could prevent success at point B.  A typical landscape, with the points relabeled to cover each component, would have user (A), network (B), remote connectivity (C), storage (D), other servers (E) and database server (F).  Storage does not necessarily mean database storage.  It can be any storage holding network-related files that ensure business continuity.

    Before writing the equation we can define the expected goal of uptime based on the production uptime percentage formula.

    Uptime minutes / Planned Uptime * 100 = Uptime %

    Planned uptime should always be set forth and well-documented.  This is more relevant if the systems are not a 24 by 7 operation.  Counting periods such as weekends, when no connectivity is required, as planned uptime could have a drastic negative impact on the overall uptime figure.  Planned downtime would cover normal updates to operating systems and database servers, effects from other key system components, and maintenance.  Maintenance can be a gray area, so plan well.  If running something like SQL Server Standard Edition, index maintenance can be a factor in overall planned downtime and uptime.  This is due to the time it takes for large indexes to be rebuilt and the data not being available during that time.  Without defining planned uptime, or planned downtime, the objectives can be nearly impossible to estimate.
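    As a rough illustration of why the planned uptime figure matters, here is a small sketch that derives planned uptime minutes for a system that is not a 24 by 7 operation.  The schedule (5 days a week, 12 hours a day, 52 weeks) and the 60 minutes of planned maintenance are hypothetical values chosen only for the example.

```python
def planned_uptime_minutes(days_per_week, hours_per_day, weeks,
                           planned_maintenance_minutes=0):
    """Scheduled service minutes minus planned maintenance minutes."""
    scheduled = days_per_week * hours_per_day * 60 * weeks
    return scheduled - planned_maintenance_minutes


# Hypothetical business-hours system: 5 x 12 for 52 weeks, 60 minutes of maintenance.
business_hours = planned_uptime_minutes(5, 12, 52, planned_maintenance_minutes=60)
print(business_hours)  # 187140

# The same 100 minute outage weighs more against the smaller planned window.
outage = 100
print((business_hours - outage) / business_hours * 100)
```

    The smaller the planned window, the more each minute of unplanned downtime costs in percentage terms, which is one reason to document the window before setting a goal.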

    Now that uptime has been defined and explained, add in the remaining components defined earlier that are required for an overall uptime objective.  Let’s assume planned uptime has been set to allow for 60 minutes of planned downtime per year for each component.  Planned uptime minutes = (365.25 * 24 * 60) - 60 = 525900, so the formula becomes uptime minutes / 525900 * 100 = Uptime %.

    If we achieved 525800 minutes of uptime, this would mean we achieved 99.98% uptime.
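    The single-component formula can be sketched in Python as follows.  The function and constant names are illustrative assumptions; the figures are the ones used in this article.

```python
# Planned uptime for the year: a 365.25-day year minus 60 planned minutes.
PLANNED_UPTIME_MINUTES = (365.25 * 24 * 60) - 60  # 525,900


def uptime_percent(uptime_minutes, planned_minutes=PLANNED_UPTIME_MINUTES):
    """Uptime minutes / planned uptime * 100."""
    if planned_minutes <= 0:
        raise ValueError("planned_minutes must be positive")
    return uptime_minutes / planned_minutes * 100


print(round(uptime_percent(525800), 2))  # 99.98
```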

    Across the five components (B through F), the overall uptime is found by averaging each component’s uptime ratio.

    ((uptime minutes / 525900) + (uptime minutes / 525900) + (uptime minutes / 525900) + (uptime minutes / 525900) + (uptime minutes / 525900))/5 * 100 = Complete Uptime %

    If the 525800 minutes were due only to a network outage at Point B, with every other component at full uptime, this would yield an overall mean uptime from Point A to Point F of about 99.996%.  Now let’s say that Point B had 525800 minutes and Point F had 525650.  Plug these uptime minutes into the equation.

    ((525800 / 525900) + (525900 / 525900) + (525900 / 525900) + (525900 / 525900) + (525650 / 525900))/5 * 100 = _____%

    Break this down (rounding up)

    (0.99980985 + 1 + 1 + 1 + 0.9995246) / 5 * 100 = 99.987

    This achievement tells the team that, together, they met 99.9% uptime, or three nines.  Three nines is an achievement to be proud of as a team.  The 99.95% uptime of the database, Point F, may not look that great in the eyes of a database administrator.  But by doing this team mean uptime calculation and reviewing the results, the areas of data availability that need work can be identified and improved.
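    The team calculation can be sketched in Python as well.  This is a minimal example, assuming the five components B through F and the uptime minutes from the example; the dictionary keys and function name are illustrative only.

```python
PLANNED_UPTIME_MINUTES = 525900

# Uptime minutes per component, using the figures from the example.
component_uptime = {
    "network (B)": 525800,
    "remote connectivity (C)": 525900,
    "storage (D)": 525900,
    "other servers (E)": 525900,
    "database server (F)": 525650,
}


def mean_uptime_percent(uptime_by_component,
                        planned_minutes=PLANNED_UPTIME_MINUTES):
    """Average the per-component uptime ratios and return a percentage."""
    ratios = [minutes / planned_minutes
              for minutes in uptime_by_component.values()]
    return sum(ratios) / len(ratios) * 100


print(round(mean_uptime_percent(component_uptime), 3))  # 99.987
```

    Keeping the per-component minutes in one structure makes it easy to rerun the calculation for each period and compare the team figure against any single component.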

    Conclusion and why we want results

    Overall uptime goals should always be set and measured, both for shorter periods and for annual results.  These measurements allow bottlenecks to be found and energy to be focused on them for the overall improvement of the mean uptime.  Funding for solutions can in most cases be proposed based on hard facts: clear numbers showing how the calculations were done and which bottlenecks they uncovered.  Without setting goals and measuring them, we do not know where we stand, which can lead to excessive downtime because points of failure are not uncovered early and resolved proactively.
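    One simple way to surface bottlenecks from the same measurements is to rank components by their unplanned downtime.  The sketch below assumes the example figures used earlier in this article; the names are illustrative and not part of any tool.

```python
PLANNED_UPTIME_MINUTES = 525900

# Uptime minutes per component, taken from the worked example.
component_uptime = {
    "network (B)": 525800,
    "remote connectivity (C)": 525900,
    "storage (D)": 525900,
    "other servers (E)": 525900,
    "database server (F)": 525650,
}


def downtime_ranking(uptime_by_component,
                     planned_minutes=PLANNED_UPTIME_MINUTES):
    """Return (component, unplanned downtime minutes) pairs, worst first."""
    downtime = {name: planned_minutes - minutes
                for name, minutes in uptime_by_component.items()}
    return sorted(downtime.items(), key=lambda item: item[1], reverse=True)


for name, minutes in downtime_ranking(component_uptime):
    print(f"{name}: {minutes} minutes of unplanned downtime")
```

    In the example, the database server and the network appear at the top of the list, which is where effort and any funding request would be focused first.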

    About the Author

    Ted Krueger is a SQL Server MVP and has been working in development and database administration for 13+ years. Specialties range from High Availability and Disaster / Recovery setup and testing methods down to custom assembly development for SQL Server Reporting Services. Ted blogs and is also one of the founders of the LessThanDot.com technology community. His articles focus on Backup / Recovery, Security, SSIS, and using all of the SQL Server features available to create stable and scalable database services. @onpnt