There can be cases when you face problems like server overloaded or underloaded but requests being rejected by Tomcat (or any other application/ web server). All these issues drill down to incorrect tuning of the server. Here is an interesting case study https://medium.com/netflix-techblog/tuning-tomcat-for-a-high-throughput-fail-fast-system-e4d7b2fc163f
From my personal experience, I found a few important parameters to be considered (specific to tomcat but other servers might have similar values)
maxThreads are actual worker threads which will actually execute the request or perform the requested operations. Setting this up correctly is tricky, as a value too high, means a lot of processing, hence CPU and memory can choke up. On the other hand, a value too low would mean we are not using server capabilities completely but still refusing requests as our all available threads are busy.
maxConnections are connections server is accepting. This will mostly depend on traffic you are expecting.
acceptcount is beyond maxConnections. Any requests which cannot be accommodated as a new connection will wait in a queue whose size is provided by acceptcount. If a request is received beyond acceptcount, it will be rejected by server.
In short, the total number of requests a server can handle at a time is acceptcount + maxconnections. And maxthread are actually threads fulfilling these requests.