At some point, you may run into problems where the SQL Server Windows service will not start or the services are hung in stopping or starting. In this case, many times the chain of events that brought the services to the state they are in was due to a configuration change, security change in the services or even startup parameter changes. One situation that can be found on a restart is configuration changes that may have happened previously and have not taken affect until the restart that is unrelated. In the case of a startup failing, some simple troubleshooting steps can be taken to determine what configuration changes or what issues may be causing the faulting service.
During a service issue recently, this scenario presented itself in a service hung stopping and there was a great need to stop the service and attempt a startup to collect needed information on what the failure was. Typically, memory, security or corruption are the primary suspects.
- Corruption in SQL Server itself causing allocation issues on startup
- Server memory faulting
- Something on the server is consuming all the memory
- SQL Server simply does not have enough memory to start
The steps initially taken to troubleshoot this were pretty direct in the line that you would take, aside from the corruption internally. The first problem that comes into play is the identifying problem of the MSSQLSERVER service being hung in stopping. This is the point we came in to check the problem.
At the point any service is hung, you’ll need to put the service in a state you can work with it. In SQL Server’s case, you’ll need to first start the MSSQLSERVER (or named instance service name) and follow the event of failure through. This is as it would be with any faulting windows service. Start the service, review the event logs after failure and follow that through with SQL Server’s logging in the event of a startup failure – ERRORLOG.
Working with Windows services that are hung in a state of starting or stopping brings in commands SC and TASKKILL. For this case, the initial step to take is to determine the PID of the service and then, decide to kill that task based on the PID. SC is a command that can communicate to the service controller/services and retrieve internal information for state. For this use, SC QUERYEX MSSQLSERVER, is the command needed to retrieve the PID.
With the PID, TASKKILL can be passed in the PID and forcibly stopped completely.
TASKKILL /F /PID 5528
Now that the service is in a stopped state, attempt to start the service so we can identify a clean logging of the failure. The events below are in a chain upon startup. 49910, 17300, 17312, 33086.
Error: 33086, Severity: 10, State: 1. (Params:). The error is printed in terse mode because there was error during formatting. Tracing, ETW, notifications etc are skipped.
Error: 17312, Severity: 16, State: 1. (Params:). The error is printed in terse mode because there was error during formatting. Tracing, ETW, notifications etc are skipped.
Error: 17300, Severity: 16, State: 1. (Params:). The error is printed in terse mode because there was error during formatting. Tracing, ETW, notifications etc are skipped.
Error: 49910, Severity: 10, State: 1. (Params:). The error is printed in terse mode because there was error during formatting. Tracing, ETW, notifications etc are skipped.
All of these relate back to a memory issue. After ruling out faulting physical memory hardware, the ERRORLOG in SQL Server is the next step to determine what information was logged upon startup and failure.
Using startup parameter –f, we can start SQL Server with an absolute minimum configuration needed. This will start SQL Server in single user mode.
Starts an instance of SQL Server with minimal configuration. This is useful if the setting of a configuration value (for example, over-committing memory) has prevented the server from starting. Starting SQL Server in minimal configuration mode places SQL Server in single-user mode. For more information, see the description for -m that follows.
To modify the startup parameters, add one using SQL Server Configuration Manager
I recommend starting SQL Server from command line in order to prevent the lock issues in service.msc from showing issues that may not be really occurring and are simply, MSC locking.
NET START MSSQLSERVER
Using SQLCMD, query with dedicated admin connection, parameter –A, so we can look at the memory settings of the instance.
As shown above, the configured maximum memory setting is only set to 128MB. After reviewing, it came to light that this setting was intended to be 12800, or 12.5GB. The “typo” was made and the individual that made the change assumed SQL Server required a restart from that change. Note, the restart is not required to change max memory settings. It does have problematic performance impacts while memory is managed from the change, but a restart is not required.
When the SQL Server was restarted, the startup process was found, as shown from the ERRORLOG above, to not have enough memory to complete the allocation process and thus, resulting in a FAIL_PAGE_ALLOCATION.
The fix for this specific issue is to simply reconfigure max memory to consume more memory for the pool by using the same admin connection from SQLCMD.
Remove the –f startup parameter and then start SQL Server normally.
Startup issues with Windows services in general can be a complex troubleshooting scenario. Luckily in SQL Server, we have a high level of logging occurring that provides much information on what areas are causing the failures.
Although this error and specific scenario is extremely rare and caused by a human error in configuring SQL Server, it is always important to be reminded of the prerequisites that are set for operating SQL Server.
In particular, the above memory constraints are set for the basis of allocating system operations in SQL Server. In the error covered, the situation that would be caused by going against these prerequisites would place SQL Server into a state of vulnerability.