O365: CDN Change Causes OWA Client Error

NOTE: This post – drafted, composed, written, and published by me – originally appeared on https://blogs.technet.microsoft.com/johnbai and is potentially (c) Microsoft.

Recently, we’ve seen a pattern of escalations wherein users are no longer able to access OWA. Specifically, the error will be similar to the following:

In the details, we’ll see the error we’re concerned with:

X-OWA-Error: ClientError;exMsg='_g' is undefined; file=https://pod51048.outlook.com/owa/:362

If we use Network Tracing (F12 in Internet Explorer) [or Fiddler], we’ll see a failure to connect to the CDN:

The connection to 'r1.res.office365.com' failed.
Error: ConnectionRefused (0x274d).
System.Net.Sockets.SocketException No connection could be made because the target machine actively refused it 23.221.8.9:443

Using ARIN, we can confirm that the CDN (Content Delivery Network) address we’re trying to hit is owned by Akamai.

The error we’re receiving occurs because the client cannot load ‘boot.worldwide.0.mouse.js’ from the CDN. This is evidenced by the refused connection to the RES server.

The root cause of this issue is the configuration of the customer-owned firewall. This is evidenced by the client’s inability to connect to one specific CDN host while other users are able to use OWA without issue; notably, because they are connecting to another CDN host (‘r4.res.office365.com’, for example).
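If you want to reproduce the connection check without a network trace, a small PowerShell sketch along these lines may help. The function name Test-TcpPort and its parameters are my own invention for illustration; it only uses the .NET TcpClient class to see whether a TCP connection on port 443 succeeds from the affected client.

```powershell
# Hypothetical helper (not part of any Microsoft module): reports whether
# a TCP connection to the given host and port can be established.
function Test-TcpPort {
    param(
        [Parameter(Mandatory)] [string] $HostName,
        [int] $Port = 443,
        [int] $TimeoutMs = 3000
    )
    $client = New-Object System.Net.Sockets.TcpClient
    try {
        $task = $client.ConnectAsync($HostName, $Port)
        # Wait() returns $false on timeout; a refused connection or a DNS
        # failure surfaces as an exception and is handled in the catch block.
        $connected = $task.Wait($TimeoutMs) -and $client.Connected
        return [bool]$connected
    }
    catch {
        return $false
    }
    finally {
        $client.Close()
    }
}

# Compare the CDN host from the failing trace with one that works for other users.
foreach ($cdn in 'r1.res.office365.com', 'r4.res.office365.com') {
    [pscustomobject]@{ Host = $cdn; Port443Open = (Test-TcpPort -HostName $cdn) }
}
```

Run it from the affected client and from a working one. If only the affected client reports a closed port for one of the hosts, that points at filtering between the client and the CDN rather than at OWA itself.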

Since the CDN is controlled by a third party and we have no way to predict which hosts will be up, in use, or referred, our recommendation is to use URL-based filtering. You can read the EHLO blog post (written by our very own Tim Heeney) on the recommendation for URL-based filtering, here: http://blogs.technet.com/b/exchange/archive/2013/12/02/office-365-url-based-filtering-is-just-better-and-easier-to-sustain.aspx

Exchange 2013: Recipient Filtering Agent (More Lessons in Proxy)

With the arrival of Exchange 2013 came some design changes and, with those changes, inevitably something was bound to catch us off-guard. To help explain this better, enter the proxy from the CAS Frontend to the Mailbox backend.

All SMTP connections (we’re focusing on SMTP for this post) utilize the proxy of the Frontend Transport to make an SMTP connection. Even on multi-role servers, the incoming request on port 25 utilizes the proxy to port 2525 for Transport.

Here is a screenshot over-view of Transport from a previous EHLO blog:

The introduction of protocol proxy isn’t limited to SMTP but we’re focusing on SMTP, as it’s relevant to this post.

Protocol proxy also introduced an issue with Transport Agents. Namely, Transport Agents that were installed on a Frontend server (CAS) were actually installed on a backend server. To get around this issue, one had to use local PowerShell and add the Exchange PowerShell Snap-in. However, what wasn’t considered is the following:

An Exchange Administrator installs the Transport Agents via PowerShell on the backend server and enables the Recipient Filtering Agent. [This introduces our problem.]

User A, external to the organization, composes a message to User B and an invalid recipient, User C, who are internal to the organization. The invalid recipient source isn’t as important as the fact that the invalid recipient exists in the header. When the external mail server connects to the on-premises Frontend, the typical SMTP traffic occurs:

  1. EHLO <domain>
  2. Server responds with “Hello <domain>”
  3. MAIL FROM: [email protected]
  4. Server responds with “Sender O.K.”
  5. RCPT TO: [email protected]
  6. RCPT TO: [email protected]
  7. DATA command is sent

At this point in the session, the proxy connection is sent to the Mailbox server that hosts UserB. The server, seeing that UserB exists, responds with “Recipient OK” but, due to tarpitting, this response doesn’t reach the Frontend yet. When the Mailbox server receives UserC, it responds with 5.1.1, but this happens during the data stream. This causes an NDR to be generated for all recipients of the message, which also means that the valid recipient never receives the email.

This is an issue with Exchange 2013 and the fully supported scenario is to only activate the recipient filtering agent on the Frontend server. The Frontend will respond correctly with the “250 2.1.5 Recipient OK” and “5.1.1 Invalid Recipient”, while the correct recipient receives the email and an NDR is generated for the invalid recipient.
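To make the difference between the two outcomes easier to see, here is a minimal sketch that splits an SMTP reply line into its pieces so each RCPT TO can be judged on its own. ConvertFrom-SmtpReply is a hypothetical helper name of mine, and the 550 basic code in the sample is an assumption: the text above only quotes the enhanced status 5.1.1.

```powershell
# Hypothetical helper: parses one SMTP reply line into code, enhanced
# status (if present), text, and whether the command was accepted (2xx).
function ConvertFrom-SmtpReply {
    param([Parameter(Mandatory)] [string] $Line)

    # Shape: <3-digit code><space or hyphen>[x.y.z ]<text>
    if ($Line -match '^(?<code>\d{3})[ -](?:(?<status>\d\.\d{1,3}\.\d{1,3})\s+)?(?<text>.*)$') {
        $code = [int]$Matches['code']
        [pscustomobject]@{
            Code           = $code
            EnhancedStatus = $Matches['status']
            Text           = $Matches['text']
            Accepted       = ($code -ge 200 -and $code -lt 300)
        }
    }
    else {
        throw "Not an SMTP reply line: $Line"
    }
}

# Illustrative replies only: one per RCPT TO, as a Frontend with recipient
# filtering would be expected to answer before DATA is ever sent.
$replies = '250 2.1.5 Recipient OK', '550 5.1.1 Invalid Recipient'
foreach ($reply in $replies) {
    ConvertFrom-SmtpReply -Line $reply
}
```

The point of the sketch is the timing: when the rejection arrives per recipient at RCPT TO, the sending server can still deliver to the valid recipient. When it arrives during the data stream, the whole message is affected.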

O365: Accessing Another Mailbox via OWA URL

It may become necessary for an admin or delegate to access a mailbox (other than their own) in OWA. There are two ways to do this and most people are familiar with the method of changing the URL, which is what I’ll be covering in this post.

In Wave14 (Exchange 2010), you merely had to append the user’s SMTP address to the OWA URL. So, for example, to access Amy Luu’s mailbox in my test tenant, I would add the following: [email protected]

In Wave15 (Exchange 2013), we merely need to add another character at the end of the URL for this to work: [email protected]/
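As a minimal sketch of the two URL shapes described above, the following function builds the address for either version. The function name, the base URL, and the mailbox address are all placeholders I made up for illustration; substitute your own OWA URL.

```powershell
# Hypothetical helper: builds the OWA URL for opening another mailbox.
# Exchange 2010 (Wave14): <owa url>/<smtp address>
# Exchange 2013 (Wave15): <owa url>/<smtp address>/   (note the trailing slash)
function Get-OwaMailboxUrl {
    param(
        [Parameter(Mandatory)] [string] $OwaBaseUrl,
        [Parameter(Mandatory)] [string] $SmtpAddress,
        [ValidateSet(2010, 2013)] [int] $ExchangeVersion = 2013
    )
    # Normalize the base so we never end up with a double slash.
    $url = '{0}/{1}' -f $OwaBaseUrl.TrimEnd('/'), $SmtpAddress
    if ($ExchangeVersion -eq 2013) {
        $url += '/'
    }
    $url
}

# Placeholder values, not a real tenant.
Get-OwaMailboxUrl -OwaBaseUrl 'https://mail.contoso.com/owa' -SmtpAddress 'user@contoso.com'
Get-OwaMailboxUrl -OwaBaseUrl 'https://mail.contoso.com/owa' -SmtpAddress 'user@contoso.com' -ExchangeVersion 2010
```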

PowerShell Scripting: EWS and IPM.Configuration.Owa.UserOptions

So, to start off, I should explain how this script came about and why it’s, currently, in a ‘legacy’ status.

There was an issue in Exchange 2010 (which has since been fixed) where the save method for the IPM.Configuration.Owa.UserOptions item was in an infinite loop due to corruption. This corruption didn’t surface until long after RTM and SP1. (My thoughts are that SP1 was the introduction of the “bad code”, but it’s not really important when it was introduced.)

What would happen is that the Mailbox Assistants would become flooded/backlogged with events in the event history table, which is kept in memory. The only sign of the issue reproducing was that OOFs and RBAs would not fire in response to emails or meeting requests, respectively. This became a tedious issue to deal with because it also caused other problems during initial remediation attempts.

For example, mailbox moves would fail, as the Mailbox Replication Service would successfully move the mailbox but the assistants were still churning away at the mailbox. After the failure, and if the new mailbox on the target database was – now – oriented to the user’s object, the new database (a.k.a.: the target database) would quickly fall into repro.

Restarting the service and having it rebuild the event history table in memory was not a viable avenue for remediation, because the table would rebuild with the infinite corruption loop – once the assistants had read in the event history table, tagged the IPM item change to be processed, and the event was interesting to the assistant[s].

The method we would use to find these users, with the affected IPM item, is to use ExMon to see which user was consuming the most CPU. We could, then, use ExTra to obtain an ETL, targeting ALL of the assistant flags. Then, we could use a program to port the ETL to CSV and see just what those assistants were chomping down on. [The ExTra method was my preferred method, as it provided two things: significant proof of repro and the user we needed to fix.]

Sadly, our only method of remediation (until SP2 + RU2 + IU) was to copy the user’s configuration settings (Get-MailboxMessageConfiguration) and, then, purge the item from the mailbox store. More often than not, this was done with MFCMAPI and we would pass a hard delete, to prevent the corrupted item from going into the dumpster.

Then, I got a novel idea: if we use EWS, we can poke into the mailbox store and obtain the item for deletion, since it’s at the root level of the container. And the rest is, as they say, history.

I thought, even though this is legacy, it might prove note-worthy or that someone could learn from the interop of .NET with PowerShell. [All of the EWS calls are in C#.NET.] Unless we have a repro of this particular issue in the nigh to immediate future (for those who are concerned, it’s been over a year since this was fixed), the propensity of it ever being used again is $null. (Little PS humor, for ya, there…)

Delete-IPMConfigurationOWAUserOptions.ps1

Hybrid Configuration Wizard: Exchange server “” was not found. Please make sure you typed the name correctly.

When you decommission an Exchange Server in your Organization and have a Hybrid Configuration, the server is not removed from the Hybrid Configuration (msExchCoexistenceRelationship) to reflect this. When you run the Hybrid Configuration Wizard, after the decommission, you’ll see the following error:

The wizard did not complete successfully. Please see the list below for error details.
Exchange server “<Server>” was not found. Please make sure you typed the name correctly.

Given the above, you’ll see something similar to the following in Exchange Management Shell for the Hybrid Configuration:

SendingTransportServers   : {UNDERJORDISKE
                            DEL:416ba20c-62c1-4a28-95e5-8dbb6430dc96, LJUSALFER
                            DEL:89bf8a7d-144f-43b8-8aee-b8ac6f98721f}

One method to resolve this issue is to do a modification of the object in Active Directory. Using ADSIEdit, browse to the configuration container and either go to the object’s path (Config container > Services > Microsoft Exchange > Organization Name > Hybrid Configuration) or just create a new query (objectClass=msExchCoexistenceRelationship). Once you have the object, look for the property you want to fix (in this case, it was ‘msExchCoexistenceTransportServers’). I removed the two defunct values.

I, then, used Exchange Management Shell to get the DistinguishedName of the current Transport Servers in the Organization:

Get-TransportService | FL DistinguishedName

DistinguishedName : CN=HYLDA,CN=Servers,CN=Exchange Administrative Group (FYDIBOHF23SPDLT),CN=Administrative Groups,CN=minvan,CN=Microsoft Exchange,CN=Services,CN=Configuration,DC=minvan,DC=se
DistinguishedName : CN=STROMKARLEN,CN=Servers,CN=Exchange Administrative Group (FYDIBOHF23SPDLT),CN=Administrative Groups,CN=minvan,CN=Microsoft Exchange,CN=Services,CN=Configuration,DC=minvan,DC=se

I copied these values into the same property from which I had previously removed the defunct values. After doing so, I re-ran the Hybrid Configuration Wizard.
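If the list of servers is long, spotting the defunct entries by eye gets tedious. Below is a small sketch that flags them, relying on the fact that Active Directory renames a deleted object by appending a line feed and DEL:<objectGuid> to its name (which is why the output above wraps onto a second line). The function name is hypothetical, and the sketch assumes each entry has already been converted to a string.

```powershell
# Hypothetical helper: returns $true when a value carries the
# "DEL:<guid>" marker that Active Directory adds to deleted objects.
function Test-DeletedObjectReference {
    param([Parameter(Mandatory)] [string] $Value)
    return $Value -match 'DEL:[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}'
}

# Sample values modeled on the output shown earlier, plus one live server.
# In a real session these strings would come from the Hybrid Configuration.
$servers = @(
    "UNDERJORDISKE`nDEL:416ba20c-62c1-4a28-95e5-8dbb6430dc96",
    "LJUSALFER`nDEL:89bf8a7d-144f-43b8-8aee-b8ac6f98721f",
    'HYLDA'
)

# List only the original names of the entries that point at deleted objects.
$servers |
    Where-Object { Test-DeletedObjectReference -Value $_ } |
    ForEach-Object { ($_ -split "`n")[0] }
```

This only identifies the stale values; the actual cleanup is still the ADSIEdit change described above.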

SysInternals: How Handle Helped Me Figure Out Outlook Was Up to No Good

So, after a lengthy day of some C# battles, a few Exchange issues, and some ded.pixel blaring through the speakers, I went to clean-up some folders from my desktop. I received the dreaded ‘Folder In Use’ message from Windows:

Alright. That’s odd. All that’s in there are two CSV files. Excel isn’t open. So, needless to say, I have no readily apparent signifier as to what’s got this folder on lock-down.

Enter handle. Handle is your friend, in these times of need.

O.k., Outlook. Why do you have a handle on the folder? I took a dump of the Outlook process, closed Outlook, and the folder could then be deleted.

It’s up to the Outlook EEs to figure out why Outlook kept a handle on that file, now.

Testing Responses to ‘GET’ Requests (The Easy Way)

Oftentimes, it may become necessary to look at the response received from a ‘GET’ request to a web server. We can do a little .NET magic to perform this.

Below is an example, tested against my favorite TV station in Sweden:

[net.Webrequest]::Create("http://www.svt.se").GetResponse()

IsMutuallyAuthenticated : False
Cookies                 : {}
Headers                 : {X-UA-Compatible, X-Varnish, Connection, Content-Length…}
SupportsHeaders         : True
ContentLength           : 50827
ContentEncoding         :
ContentType             : text/html;charset=UTF-8
CharacterSet            : UTF-8
Server                  : Apache
LastModified            : 6/22/2014 8:21:33 PM
StatusCode              : OK
StatusDescription       : OK
ProtocolVersion         : 1.1
ResponseUri             : http://www.svt.se/
Method                  : GET
IsFromCache             : False

You can use the command against any URI you wish to test/verify is up and responsive (which is handy for proving/disproving network issues).
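If you find yourself running this often, it may be worth wrapping the one-liner so that failures come back as data rather than as a red error. This is a minimal sketch, not a hardened tool: the function name Test-HttpGet, the output property names, and the 10-second timeout are my own choices.

```powershell
# Hypothetical wrapper around the WebRequest call shown above.
function Test-HttpGet {
    param([Parameter(Mandatory)] [string] $Uri)

    $response = $null
    try {
        $request = [System.Net.WebRequest]::Create($Uri)
        $request.Method = 'GET'
        $request.Timeout = 10000   # milliseconds
        $response = $request.GetResponse()
        [pscustomobject]@{
            Uri        = $Uri
            Success    = $true
            StatusCode = [int]$response.StatusCode
            Detail     = $response.StatusDescription
        }
    }
    catch {
        # GetResponse() throws for refused connections, timeouts, and for
        # HTTP error statuses. Walk the exception chain to find a
        # WebException, which carries the HTTP status when one was received.
        $ex = $_.Exception
        while (($null -ne $ex) -and ($ex -isnot [System.Net.WebException])) {
            $ex = $ex.InnerException
        }
        $status = if (($null -ne $ex) -and ($null -ne $ex.Response)) { [int]$ex.Response.StatusCode } else { $null }
        [pscustomobject]@{
            Uri        = $Uri
            Success    = $false
            StatusCode = $status
            Detail     = $_.Exception.Message
        }
    }
    finally {
        # Release the connection whether or not the request succeeded.
        if ($null -ne $response) { $response.Close() }
    }
}
```

Usage would look like Test-HttpGet -Uri 'http://www.svt.se'. A failed result with a status code means the server answered, so the network path is fine; a failed result with no status code suggests the request never got a reply.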

Debugging: When Recursive Becomes Recursively Too Much

Recently, I had a problem with a build environment and, specifically, I was receiving a StackOverflowException.

This is what the exception information looked like from the Application Log perspective:

Faulting application name: <redacted>, version: <redacted>, time stamp: 0x5347902e

Faulting module name: clr.dll, version: 4.0.30319.34014, time stamp: 0x52e0b784

Exception code: 0xc00000fd

Fault offset: 0x00002cf4

Faulting process id: 0x1848

Faulting application start time: 0x01cf815776173165

Faulting application path: <redacted>

Faulting module path: C:\Windows\Microsoft.NET\Framework\v4.0.30319\clr.dll

Report Id: b5a7c204-ed4a-11e3-825f-6c3be5144bba

Faulting package full name:

Faulting package-relative application ID:

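Not much to go on, right? The exception code 0xc00000fd is STATUS_STACK_OVERFLOW, which matches the StackOverflowException, but nothing in the event tells us where the recursion is.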

I attached adplus (via pmn) and caught the First Chance Exception and process exit dumps. Using the first chance exception dump, I could see (via the stack back-trace) that thread 0 (1eb8) was well over 7,000 frames and that the same objects were repeating on the thread-stack (!dso).
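Once the stack is saved to a text file, a quick way to see what is repeating is to count the frames. The sketch below is an assumption-laden illustration: Get-RepeatedFrame is a made-up name, and it expects one frame name per line, so real debugger output would need its address columns stripped first.

```powershell
# Hypothetical helper: counts how often each frame appears and returns
# the most frequent ones, which is where runaway recursion shows up.
function Get-RepeatedFrame {
    param(
        [Parameter(Mandatory)] [string[]] $Frame,
        [int] $Top = 5
    )
    $Frame |
        ForEach-Object { $_.Trim() } |
        Where-Object { $_ } |                 # skip blank lines
        Group-Object |
        Sort-Object Count -Descending |
        Select-Object -First $Top -Property Count, Name
}

# Tiny made-up stack: two methods calling each other, plus the entry point.
$sample = 'Parser.Visit', 'Parser.Expand', 'Parser.Visit', 'Parser.Expand', 'Parser.Visit', 'Program.Main'
Get-RepeatedFrame -Frame $sample
```

Against a saved stack you would pass the file contents instead, for example Get-RepeatedFrame -Frame (Get-Content .\stack.txt). A handful of frames with counts in the thousands is a strong hint that those methods form the recursive loop.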

Using this information, I went to the developers of the internal application and they had a compiled fix for me to test and run.

Needless to say, it’s a lot easier to get work done without the StackOverflowException haunting me.

Attached, you’ll find a (heavily-redacted) text file (sort-of) illustrating the recursion in the thread.

Happy debugging! 🙂

stack.txt

Server2012: Adding DAG Member Results in ‘The fully qualified domain name for node ‘EMEA-SWEEX15-01’ could not be found.’

To get around this, add the permissions by following this article for pre-staging the Cluster Name Object for a Database Availability Group. If you still receive the error, modify the ‘dNSHostName’ attribute to reflect the FQDN of your DAG, as referenced below from my lab (suffix removed from screenshot):

C# + EWS: Autodiscover Test (Exchange and O365)

In times of troubleshooting client-side issues, it may become necessary to query for the autodiscover response the user is receiving from either Exchange on-premises or Exchange in O365 – or, in the case of a redirection, both on-premises and O365 Exchange. This is a sample C#.NET Console Application, which will query for the Autodiscover response and use the TraceListener class to write the response to files.

There are two things the code doesn’t account for:

1. The condition wherein the user’s SMTP address and UPN are different.
2. ALL of the possible returns from Autodiscover for the UserSettings.

Thus, I have included the source code for two reasons:

1. Promotion of writing .NET programs for both on-premises Exchange and O365 Exchange.
2. Customization of both the UserSettings one is targeting and the target delivery folder for which the files should be saved.

If you have any problems, questions, or concerns, feel free to reach out to me and I’ll try to address them as soon as possible.

The source code can be found here: http://gallery.technet.microsoft.com/C-EWS-Autodiscover-Test-870b4a8e