My analysis
1
In §§3-5 I outline your misconceptions.
In §§6-7 I outline my high-quality recommendations to resolve your problem.
2. Key definitions
ꘖ: AWS RDS SQL Server
A꙳: your application connected to ꘖ
†: your performance problem
L-Sᨀ: LCK_M_S (Shared Lock)
L-Xᨀ: LCK_M_X (Exclusive Lock) or LCK_M_IX (Intent Exclusive Lock)
FKCᨀ: Foreign Key Constraint
߷₁: your screenshot 1
߷₂: your screenshot 2
I↓: INSERT operation in ꘖ
S↑: SELECT operation in ꘖ
RCS⁎: READ_COMMITTED_SNAPSHOT mode in ꘖ
S↑᛭: the S↑, occupying the first row in the «Top SQL» section of ߷₂
T₁: table APILogEntry (the target of I↓ shown in ߷₁)
T₂: table CompSubActivity (the target of I↓ shown in ߷₂)
Tᵢ: a «table with a high rate of concurrent updating» (T₁ / T₂)
Pᵢ: the parent table of T₁ or T₂
M₁, M₂, M₃: your misconceptions (§§3-5)
R₁, R₂: my recommendations (§§6-7)
3. M₁
You mistakenly believe that † is caused by contention among I↓ processes for resources in Tᵢ.
In reality, I↓ requires an L-Xᨀ on Tᵢ, but it never requests an L-Sᨀ on the inserted record itself.
In practice, the appearance of L-Sᨀ during I↓ occurs in only 1 case: when ꘖ attempts to validate a FKCᨀ and cannot read the row in Pᵢ due to an L-Xᨀ held on Pᵢ.
Therefore, the L-Sᨀ wait associated with I↓ (visible in ߷₁ / ߷₂) indicates that I↓ is waiting for permission to read data from Pᵢ (to validate the FKCᨀ), rather than permission to write to Tᵢ.
Consequently, the real bottleneck is not Tᵢ, but Pᵢ, access to which is blocked by other transactions.
Thus, † is caused by the transactions modifying Pᵢ, while Tᵢ is merely a victim of the incorrect (tight) architectural coupling of A꙳ (described in M₃).
4. M₂
You mistakenly view † as a conflict occurring exclusively between I↓ operations, overlooking that in the default ꘖ locking model (with RCS⁎ disabled), reading and writing transactions block each other.
߷₂ shows that the highest load on ꘖ is generated by S↑᛭, which most likely holds locks on Pᵢ or on resources within Tᵢ required by I↓.
If S↑᛭ performs aggregation or reporting on user tables without the NOLOCK hint or enabled RCS⁎, it holds locks that trigger L-Sᨀ waits and block I↓ access to these records, or, conversely, waits for I↓ to complete, creating blocking chains.
L-Sᨀ is characteristic of «Read vs Write» conflicts, rather than «Write vs Write» ones (where L-Xᨀ would dominate).
Consequently, your erroneous attempt to optimize I↓ without eliminating the contention with resource-intensive S↑ (such as S↑᛭) is bound to fail.
5. M₃
Your approach to solving † by optimizing I↓ is fundamentally flawed.
The root cause of † lies in the architectural decision (already mentioned in M₁) to synchronously write high-volume A꙳ logs to tables enforcing FKCᨀ.
The correct, high-quality solutions for † are described in R₁ and R₂.
6. R₁
Enable RCS⁎.
RCS⁎ activates row versioning for read operations.
ꘖ will process S↑ using data versions stored in tempdb instead of placing L-Sᨀ on tables.
7. R₂
Remove FKCᨀ from Tᵢ.
This will eliminate the requirement for ꘖ to place L-Sᨀ on Pᵢ during I↓.