enduser

Members
  • Content Count

    7
Community Reputation

0 Neutral
  1. Has anyone ever seen the message "Server lost" followed by a GUID in the Genetec log in the Windows Event Viewer? If so, did you ever get to the bottom of what causes it? We are seeing these, usually at the same time as cameras on one or more archivers fail over to the backup archiver. Running 5.2 SR6.
  2. I'll look into that, but to be honest I would expect an NTP call to feature much earlier in the boot-up sequence than Genetec starting. That said, there is a registry setting: HKLM\Software\Policies\Microsoft\Windows NT\CurrentVersion\Winlogon\SyncForegroundPolicy. I have had trouble with this before; it may be worth setting it to 1 to make Windows wait for the networking stack to sort itself out before letting apps run.
  3. Apologies for the late response; if there was a "new post" notification, I missed it. Genuser, the errors found were related to the UDLD protocol, which I believe is related to routing management for the LAN, but I'm not going to pretend I understand the ins and outs of it.
  4. Fingers crossed, we have overcome the dropout issues we were seeing. Nothing was "obviously" fixed, but the network team found some errors on the access layer switch used by one of the server stacks, relating to the distribution switch uplinks. These had no obvious cause. They tested the fibre, which came up good, and re-seated the SFPs, and that was all they did (or all they are admitting to). No dropouts for almost six weeks now.
  5. Hi,
> ssmith10pn The SR2 installation was our first Genetec deployment, so this is a clean build, not a migration. Cameras, clients, servers and storage are all new, as is the supporting network design. There are no federations.
The system currently comprises two dedicated directory servers (active and backup) and six archiver servers; each archiver server hosts one primary and one backup archiver role. All the servers correspond to the Genetec "Large" specification, so 300Mb/sec capacity. Servers are split across two locations for resilience. Each archiver has two 8Gb FC HBAs connected to a pair of FC switches (Multipath I/O enabled) and then a stack of SAN storage, and each server uses a single Gigabit Ethernet port for CCTV feeds. The SANs are not interconnected.
There are only about 100 cameras online at the moment: a variety of Axis, some Oncams and a very small number of Areconts. All of these are IP cameras, and all suffer from sporadic failovers, but only the Oncams exhibit the "5 seconds past" thing.
More cameras are due to be integrated as building works progress, so for the current camera count we are very over-specified with servers and storage. Our busiest archiver is only peaking at 150Mb/sec, and the quietest is running at about 20Mb/sec. On CPU the servers rarely exceed 15% and have plenty of RAM free, and the underlying OS is 64-bit, so there are no 4GB memory limitations. SAN performance figures are also minimal (in a good way) at this time, with no sign of any write queues or high CPU utilisation. The server tier is also specified to support the whole camera set from one server room if the other goes away for any reason, so under normal utilisation we should not be stressing our servers or storage.
The quiet servers (or rather their cameras) seem to fail just as often as the busier ones. We left them "unbalanced" to see whether loading was significant in terms of camera dropouts, but it doesn't appear to be.
The underlying network is MPLS based, built on Cisco kit, with core/distribution/access layers, and CCTV sits in a private VPN subdivided into separate client, server and camera VLANs. Cameras and clients are divided into two VLANs each in order to balance load across the inter-switch uplinks. Only essential traffic is allowed in and out of the VPN (AD authentication, AV updates, that sort of thing). Cameras and servers use fixed IP addresses, with clients on DHCP, and it is IPv4 only. The network engineer reports plenty of spare capacity on the network and no obvious issues in the switch logs.
Antivirus is installed on the servers, but an exclusion prevents scanning of the Genetec binaries and all the drives used for video storage. I have checked and, as far as I can tell, this is working as planned: Resource Monitor shows no access to these locations from the AV service.
I've not seen anything in the SR8 release notes that looks like it might address the issues, but we are considering it regardless.
> devsec I won't pretend I understand the implications of your integration suggestion, but I will talk it through with somebody who does. I'm more from an IT apps and infrastructure background; this is my first foray into CCTV. Losing the dewarping capability would be a no-no, as the users like that a lot. To answer your questions:
1) All the camera types have issues, but the Oncam does have this specific "5 seconds past" one all to itself.
2) Sometimes we see gspot errors in the "Genetec" logs in the Windows Event Viewer, and sometimes "connection to archiver role lost" messages. These can correspond to camera dropouts, but often dropouts happen without these messages appearing.
One thing I would say is that the actual CCTV users seem very happy with the system in general.
Some dropouts cause a short loss of recorded footage, but users are not complaining of a loss of live view. Obviously that short outage would be critical if it occurred during some sort of incident, though, and that's my main concern. Thanks for your interest; any suggestions or queries are greatly appreciated. Neil
  6. Hi, we have had multiple sessions with our local integrator on site and Genetec engineers remotely accessing the system from Canada to investigate the issues, and unfortunately we are making little progress. The installation is about a year old and has had the problem all along, initially on SR2 and continuing since we upgraded to SR6. SR6 did solve one issue, a memory leak in WMIPRVSE, which reached an internal memory cap, then died and took the archiver with it. WMIPRVSE is actually a Windows module, not a Genetec one, but possibly SR6 is using it differently.
I've exported months' worth of dropout logs into a database so that I could try to spot a pattern. I've looked for patterns in archiver allocation, camera type, VLAN, network switch, date and time, pretty much everything I could think of. We have also Wiresharked the archivers through a number of dropouts and sent the captures to Genetec, and they have been examined by me and by the network engineer who designed the local network. The only apparent patterns I can spot are:
1) If I reboot the archivers and directory servers, things generally seem stable for about two weeks, then the issues start to recur.
2) One particular camera type, Oncam, often fails at 5 seconds past the minute. Sometimes the cameras recover instantly on the same archiver; sometimes they flip to the backup archiver, then flip back 60 seconds later. In the second case, we see a ping from archiver to camera right on the minute, with no response from the camera. The server then sends an RTSP teardown to the camera, which fails over onto the backup server. On the next minute there is another ping, a response is received from the camera, and the primary archiver re-initiates the RTSP session. In the first case, there is no obvious cause in the Wireshark output.
  7. I know this is an old thread, but I was searching through Google as we are having problems with a Genetec installation, and the symptoms sound very similar. Cameras drop connections at random times or flip over to the backup archiver for no obvious reason, connectivity to archiver roles is lost, and gspot errors appear in the Genetec event logs. Try as we might, we cannot find an obvious pattern or cause. We are running SC 5.2 SR6 and a mixed bag of IP cameras. We have dedicated directories, and our servers are bumping along the bottom on every indicator at the moment, nowhere near utilising a significant amount of network bandwidth, storage bandwidth, memory or CPU. I would be very interested in hearing from the original poster, or anyone experiencing similar issues. N
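For anyone wanting to check their own logs for the "5 seconds past the minute" pattern described in post 6, here is a minimal sketch in Python. It is not what was actually run against the dropout database. The function names, the timestamp format and the sample data are all made up for illustration, and it assumes the dropout times have already been exported as plain text timestamps.

```python
from collections import Counter
from datetime import datetime


def second_of_minute_histogram(timestamps, fmt="%Y-%m-%d %H:%M:%S"):
    """Count dropout events by the second-of-minute at which they occurred."""
    counts = Counter()
    for raw in timestamps:
        counts[datetime.strptime(raw, fmt).second] += 1
    return counts


def peak_seconds(counts, min_share=0.5):
    """Return the seconds-of-minute that hold at least min_share of all events."""
    total = sum(counts.values())
    if total == 0:
        return []
    return sorted(s for s, n in counts.items() if n / total >= min_share)


# Hypothetical sample data, not taken from any real log.
sample = [
    "2015-03-02 09:14:05",
    "2015-03-02 11:40:05",
    "2015-03-03 02:07:05",
    "2015-03-03 16:22:41",
]

hist = second_of_minute_histogram(sample)
print(peak_seconds(hist))  # prints [5]
```

Grouping by second-of-minute rather than by hour or date is what would expose a polling-interval effect like the on-the-minute ping; running it separately per camera type would show whether the clustering is limited to one model.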