Reliability, Network Integration, and System Monitoring Challenges for a Restaurant Menu Website in an Enterprise IT Environment

I am managing a restaurant menu website that is deployed within an enterprise-style infrastructure rather than a typical shared hosting environment. The site is hosted on an internal server stack that integrates with corporate networking components, firewalls, and monitoring tools commonly used in industrial and enterprise IT setups. While the website itself is relatively straightforward in terms of functionality, we are encountering unexpected reliability issues such as intermittent downtime, slow response times, and inconsistent availability across different locations. These issues are difficult to diagnose because they do not consistently reproduce in development or staging environments, and they appear to be influenced by network routing, internal DNS resolution, or security policies enforced at the infrastructure level.

One of the primary challenges involves network segmentation and firewall rules. The website must be accessible both internally (for staff updating menu content) and externally (for customers viewing the menu), which requires carefully configured routing and access controls. However, changes to firewall policies or network zones sometimes result in partial access failures, where the site loads for some users but not for others. In some cases, static assets load correctly while API endpoints or dynamic content fail. This raises questions about how best to design network rules and access paths for a web application that needs to operate reliably across multiple network boundaries without compromising security.
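To make the "static assets load but API calls fail" symptom easier to pin down, here is a minimal sketch of a probe that could be run from each network zone and compared afterwards. The vantage labels, the `ProbeResult` type, and the `probe` and `diagnose` helpers are all hypothetical names chosen for illustration, not part of any existing tool, and the hints in the report are only starting points for investigation.

```python
from dataclasses import dataclass
from urllib.error import URLError
from urllib.request import urlopen


@dataclass(frozen=True)
class ProbeResult:
    vantage: str  # where the probe ran, e.g. "internal" or "external" (assumed labels)
    kind: str     # "static" or "api"
    ok: bool


def probe(url, timeout=3.0):
    """Return True if the URL answers successfully within the timeout."""
    try:
        with urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (URLError, OSError, ValueError):
        # HTTP errors, timeouts, DNS failures and malformed URLs all count as failures.
        return False


def diagnose(results):
    """Group probe results by vantage point and label the failure pattern."""
    by_vantage = {}
    for r in results:
        by_vantage.setdefault(r.vantage, {})[r.kind] = r.ok

    report = {}
    for vantage, kinds in by_vantage.items():
        static_ok = kinds.get("static", False)
        api_ok = kinds.get("api", False)
        if static_ok and api_ok:
            report[vantage] = "healthy"
        elif static_ok:
            report[vantage] = "static-only: check rules for the API path or port"
        elif api_ok:
            report[vantage] = "api-only: check the static asset host or cache path"
        else:
            report[vantage] = "unreachable: check zone routing and DNS"
    return report
```

Running the same probe from a staff subnet and from outside the perimeter, then feeding both sets of results to `diagnose`, should show whether a firewall change affected one zone or one path type rather than the whole site.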

System monitoring and diagnostics are another area of concern. We rely on centralized monitoring tools to track server health, network traffic, and application availability, but correlating website issues with infrastructure events is challenging. Short network interruptions, CPU spikes, or storage latency can cause the site to become temporarily unavailable, yet these events are not always flagged as critical by existing monitoring thresholds. I am looking for guidance on best practices for defining meaningful metrics and alerts for web applications running within an enterprise or industrial IT environment, especially where uptime and responsiveness are important but traditional OT-style monitoring may not be fully aligned with web workloads.
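As an illustration of the kind of web-oriented alerting I mean, the sketch below evaluates availability and 95th-percentile latency over a rolling window of synthetic checks, instead of relying on a single host-level threshold. The class name `WindowAlert`, the window size, and both thresholds are assumptions picked for the example and would need tuning against real traffic.

```python
import math
from collections import deque
from dataclasses import dataclass


@dataclass
class Sample:
    ok: bool            # did the synthetic check succeed?
    latency_ms: float   # response time of the check


class WindowAlert:
    """Rolling-window check on success ratio and p95 latency (assumed thresholds)."""

    def __init__(self, size=20, min_success=0.95, max_p95_ms=800.0):
        self.samples = deque(maxlen=size)  # oldest samples drop off automatically
        self.min_success = min_success
        self.max_p95_ms = max_p95_ms

    def add(self, sample):
        self.samples.append(sample)

    def evaluate(self):
        """Return the names of the alerts that currently fire."""
        if not self.samples:
            return []
        alerts = []
        success = sum(1 for s in self.samples if s.ok) / len(self.samples)
        if success < self.min_success:
            alerts.append("availability")
        latencies = sorted(s.latency_ms for s in self.samples if s.ok)
        if latencies:
            # Nearest-rank percentile: the smallest value covering 95% of samples.
            idx = max(0, math.ceil(0.95 * len(latencies)) - 1)
            if latencies[idx] > self.max_p95_ms:
                alerts.append("latency_p95")
        return alerts
```

A short network interruption shows up here as a drop in the success ratio even when CPU, memory, and disk metrics on the server all stay within their normal thresholds.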

Integration with backend systems also adds complexity. Menu data is periodically synchronized from internal systems that manage pricing, availability, and promotions. These systems are not web-native and were originally designed for internal use, which means data synchronization can be delayed or fail silently. When this happens, the website may display outdated or incomplete information without obvious errors. Designing a reliable data exchange mechanism, with proper validation and fallback behavior, is proving difficult within the constraints of existing enterprise systems and network policies. Any insight into patterns or tools that help bridge modern web applications with legacy or industrial backend systems would be valuable.
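One pattern that seems to fit is "validate, then swap, otherwise keep the last known good copy", combined with an explicit staleness check so that a silent sync failure becomes visible. The sketch below is only an outline of that idea: the field names in `REQUIRED_FIELDS`, the `MenuState` type, and the six-hour default are assumptions, since the actual export format of the internal systems is not described here.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

REQUIRED_FIELDS = ("id", "name", "price")  # assumed schema, adjust to the real export


def validate_items(items):
    """Return a list of human-readable problems; an empty list means the payload is usable."""
    errors = []
    if not items:
        errors.append("empty payload")
    for i, item in enumerate(items or []):
        for field in REQUIRED_FIELDS:
            if field not in item:
                errors.append(f"item {i}: missing {field}")
        if "price" in item:
            price = item["price"]
            is_number = isinstance(price, (int, float)) and not isinstance(price, bool)
            if not is_number or price < 0:
                errors.append(f"item {i}: invalid price {price!r}")
    return errors


@dataclass
class MenuState:
    items: list
    synced_at: datetime  # time of the last successful sync


def apply_sync(current, incoming, now):
    """Swap in the new menu only if it validates; otherwise keep the current one."""
    errors = validate_items(incoming)
    if errors:
        return current, errors  # fallback: last known good data stays live
    return MenuState(items=incoming, synced_at=now), []


def is_stale(state, now, max_age=timedelta(hours=6)):
    """True when no successful sync has happened within max_age."""
    return now - state.synced_at > max_age
```

The returned error list can be logged or forwarded to monitoring, and `is_stale` can drive an alert or a "menu last updated" notice, so that outdated data no longer goes unnoticed.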

Scalability and future growth are also key considerations. The current setup is expected to expand to support multiple locations, each with unique menu variations and update schedules. This will increase load on both the web server and the internal systems supplying data. I want to avoid creating a brittle architecture where small infrastructure changes or system upgrades lead to unexpected outages or data inconsistencies. Understanding how to design a resilient architecture that aligns with enterprise IT standards, while still supporting a responsive and user-friendly website, is a major goal of this project.
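For the per-location variations, one option I am considering is a shared base menu plus a small set of overrides per location, resolved into a complete menu at publish time. The sketch below only illustrates that idea; the `resolve_menu` function and the convention that an override of `None` removes an item are assumptions, not an existing design.

```python
def resolve_menu(base, overrides):
    """Combine a base menu with one location's overrides.

    base:      dict mapping item id -> dict of item fields
    overrides: dict mapping item id -> dict of changed fields,
               or None to remove the item at this location (assumed convention)
    """
    menu = {}
    for item_id, item in base.items():
        patch = overrides.get(item_id, {})
        if patch is None:
            continue  # item is not offered at this location
        menu[item_id] = {**item, **patch}  # override fields win over base fields
    for item_id, patch in overrides.items():
        if item_id not in base and patch is not None:
            menu[item_id] = dict(patch)  # location-only item
    return menu
```

Keeping the overrides small means a change to the base menu reaches every location automatically, while each location's differences stay easy to review.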

Finally, I am seeking advice from the Siemens community on general best practices for operating and maintaining web applications within enterprise or industrial IT environments. Specifically, I am interested in lessons learned around network design, system monitoring, data integration, and long-term maintainability. Any recommendations, references, or real-world experiences related to hosting and managing customer-facing web services alongside enterprise infrastructure would be greatly appreciated. Sorry for the long post.

Can anyone help with this? Thanks in advance.